MATS4300 Analysis and X-ray tomography

Course information in Korppi
Instructor: Joonas Ilmavirta (firstname.lastname@jyu.fi, MaD310)
First period, fall 2017

Course material

The course is based on lecture notes. Exercises are included in the notes; there are no separate exercise sheets.

first version
second version
third version (last one to be used in the course)
latest version on arXiv (the file will not be updated on this page)

Completing the course

The course is completed by solving a sufficient number of exercises and returning them at the end of the course. Each exercise is graded (0–2 points) and the course grade is calculated from the total points. The exercises must be given to the instructor (by email, in person, or in the mail slot) by Thursday, October 26, 2017. Later dates are possible, but this will delay grading significantly, and you should contact the instructor for details. Electronic submissions are preferred.

In addition to the normal points from exercises, one can earn bonus points:

3 points: Exercises neatly typeset by LaTeX and returned by email. (Pictures can be hand-drawn, scanned and in a separate file.)
20 points: Prepare a 20-minute talk alone or a 45-minute talk in a pair. For suitable topics, contact the instructor.
Varying: Bonus exercises in the lecture notes. An answer to a bonus exercise is worth 0–5 points, depending on the question and the answer.

There are a total of 125 exercises in the first version of the lecture notes, so 250 points are available. The limits for grades are as follows:

1: at least 125 points (this amount of points is required to pass regardless of any bonuses)
2: at least 157 points
3: at least 189 points
4: at least 221 points
5: at least 253 points

The limits can be decreased if adjustment is needed. They will not be increased. If more exercises are published, there is more to choose from but the limits will remain.
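As a rough illustration of the arithmetic above, the following Python sketch maps points to a grade. The function name `course_grade`, the return value 0 for a failed course, and the reading that the 125 points needed to pass must come from normal exercises alone are assumptions made for this example; the instructor's actual calculation may differ, and the limits may be lowered.

```python
# Grade limits as listed above, highest grade first.
GRADE_LIMITS = [(5, 253), (4, 221), (3, 189), (2, 157), (1, 125)]
# Assumption: this many points must come from normal exercises, bonuses excluded.
PASS_LIMIT = 125


def course_grade(exercise_points: int, bonus_points: int = 0) -> int:
    """Return the grade 1-5, or 0 (hypothetical convention) if the course is not passed."""
    if exercise_points < PASS_LIMIT:
        return 0
    total = exercise_points + bonus_points
    for grade, limit in GRADE_LIMITS:
        if total >= limit:
            return grade
    return 0
```

With these limits, the 250 points available from the 125 exercises alone give grade 4; reaching grade 5 requires at least 3 bonus points or exercises published later.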

The weekly exercise sessions are a forum to discuss the problems and verify the answers. The solutions are not graded in or before the exercise sessions.

Each section ends in a question asking about confusing points, feedback, or any related questions. Mentioning one smaller thing gives one point; mentioning two or more things or something more substantial gives two points. You can also use questions you asked during the lecture. The answers to these special exercise problems must be given to the instructor at or before the relevant exercise session in writing. This ensures that any confusions can be cleared up quickly. You can also include them in the final returned exercise set; it will help keep everything in one place but will not affect grading.

Attendance has no effect on grading. The weekly compulsory exercises (see above) can be returned by email or to the mail slot. The only exception concerns the bonus points talk, which has to be given in person.

Schedule

There will be lectures and exercises as indicated in Korppi. However, there are slight alterations to the roles of the events. Each lecture (45 + 45 minutes) will cover a single section of the lecture notes, and each exercise session will cover two sections. No separate exercise sheets will be given; the problems are contained in the notes.

Preliminary plan:

Tuesday, September 5, MaD381, 2–4 pm: Lecture 1
Thursday, September 7, MaD380, 10–12 am: Lecture 2
Monday, September 11, MaD355, 12–2 pm: Exercise 1–2
Tuesday, September 12, MaD381, 2–4 pm: Lecture 3
Thursday, September 14, MaD380, 10–12 am: Lecture 4
Monday, September 18, MaD346, 12–2 pm: Exercise 3–4 (note the unusual place)
Tuesday, September 19, MaD381, 2–4 pm: Lecture 5
Thursday, September 21, MaD380, 10–12 am: Lecture 6
Monday, September 25, MaD355, 12–2 pm: Exercise 5–6
Deadline: Giving talks must be decided by Monday, September 25
Tuesday, September 26, MaD381, 2–4 pm: Lecture 7
Thursday, September 28, MaD380, 10–12 am: Lecture 8
Monday, October 2, MaD355, 12–2 pm: Exercise 7–8
Tuesday, October 3, MaD381, 2–3 pm: Student talks (see time)
Thursday, October 5, MaD380, 10–12 am: Lecture 9
Monday, October 9, MaD355, 12–2 pm: Lecture 10 (marked as exercise in Korppi)
Tuesday, October 10, MaD381, 2–4 pm: Lecture 11
Thursday, October 12, MaD380, 10–12 am: Student talks
Monday, October 16, MaD355, 12–2 pm: Exercise 9–10
Tuesday, October 17, MaD381, 2–4 pm: Lecture 12
Thursday, October 19, MaD380, 10–12 am: Lecture 13
Monday, October 23, MaD355, 12–2 pm: Exercise 11–12
Deadline: Solutions to exercise problems must be returned by Thursday, October 26 (negotiable; see above)

The talk schedule is as follows. Each talk is about 20 minutes long. The times are approximate, but talks will be interrupted forcibly after 25 minutes.

Tuesday, October 3
14:15–14:35	Covi	Riesz potentials
14:35–14:55	Hörmann	Radon's inversion formula
14:55–15:15	Railo	The attenuated X-ray transform
Tuesday, October 10
14:15–14:35	Zhu	Fourier transform
14:35–16:00	Lecture 11
Thursday, October 12
10:15–10:55	Rasimus & Uusluoto	Distributions
11:15–11:35	Mönkkönen	Fourier series

Questions and answers

Here are your questions and my answers to them. Some of the comments were typos, and they will be corrected in future versions of the notes. They are not included here but are much appreciated. $\newcommand{\R}{\mathbb R}\newcommand{\C}{\mathbb C}\newcommand{\Z}{\mathbb Z}\newcommand{\T}{\mathbb T}\newcommand{\N}{\mathbb N}\newcommand{\xrt}{\mathcal I}\newcommand{\ft}{\mathcal F}\newcommand{\abs}[1]{\lvert #1 \rvert}\newcommand{\A}{\mathcal A}\newcommand{\der}{\mathrm{d}}\newcommand{\dd}{\,\der}$

Section 1

Why are we only interested in injectivity (uniqueness of solutions)? Why not also existence? Existence of solutions is also important in practice: If we have some data, does there even exist a system (within our model) that could produce it? Surjectivity depends on the choice of the target space. The question therefore is to characterize the range of the forward operator. For example, what is $\xrt(C_c(\R^n))$? I chose to leave such characterizations out of the course.
Why only left inverses? If the target is shrunk to be precisely the range of the forward operator, then a left inverse is automatically a right inverse as well. However, a left inverse is the thing one needs in practice; it is the thing you need to analyze your data.
I can see why we take $X$ to be the set of compactly supported functions, as they are supposed to represent a physical object. But is it not restrictive to use only continuous functions? The attenuation may jump across an interface. Yes, non-continuous attenuations are physically relevant. We restricted the definition to continuous functions for technical convenience. Other conditions that guarantee integrability over all lines easily become quite clumsy to state. It is possible to define the X-ray transform for far more singular objects, where integrals are only well defined over almost every line or in a distributional sense. There are exercises where the X-ray transform is taken of a characteristic function although it was outside our function spaces. This was purposeful; the definition is natural, and observing this (non-)issue was an implicit part of the exercise.
When we compute the X-ray transform using the redundant parametrization for the lines in $\R^n$, the resulting function should be invariant under translation of $x$ along the line, right? That is, if we change $x \leadsto x+sv$, $\forall s\in\R$, the X-ray transform should not change. This does not appear to be the case in exercise 9. There was indeed a typo in the exercise. This is an example of range characterization; the invariance property is indeed necessary for a function to be in the image of the X-ray transform.
In exercise 2 is it meant to apply the Beer-Lambert law (ray of light) to the supposed X-ray beam along the line? Yes, the law is valid along every line.
Could one similarly as expressing lines not passing through the origin in $\R^2$ also express hyperplanes in $\R^n$ by the closest point to the origin? Yes, this is typical in the analysis of the Radon transform. It was mentioned in the lectures that all lines in the plane can be characterized by the closest point in "extended polar coordinates" $(r,\theta)\in[0,\infty)\times S^1$ where we have "several directed origins". The same thing happens with hyperplanes in $\R^n$ and $[0,\infty)\times S^{n-1}$.
The topic of the course was restricted to mathematics. This was described to be a part of a bigger picture, where one first models a physical problem, then solves a mathematical problem and finally interprets the result. But how does it fit into this picture that we often know how to interpret results way before solving the mathematical problem? The steps don't have to be taken in order. We do indeed often know how to interpret the results before they are obtained, and one only has to do a sanity check (results look realistic) afterwards. Much of the interpretation comes automatically from the modelling step.
In the first item of the fourth list in section 1.1 continuity of the functions is not postulated, right? But of course continuous functions make more sense regarding our goal of stability. To make the question precise, I will add the continuity assumption. However, stability is an interesting and reasonable question even when the functions are not continuous. The X-ray transform is a continuous operator (between suitable function spaces) even when the functions themselves are not continuous. Looking for a continuous inverse has nothing to do with continuity of the reconstructed function itself.
Where does the Radon transform physically come from? Or is it just another method for doing nearly the same thing as X-ray tomography? The X-ray transform is far more common in applications. I don't remember any specific examples where the Radon transform would appear directly. It is partly popular for historical reasons, partly due to its relation to X-ray tomography (exercise 7).
Can you give a brief summary of the tools used for different injectivity proofs? This is actually exercise 121. The five proofs are in sections 3, 4–5, 6, 7–9, and 12. Broadly, the tools are Fourier analysis, functional analysis, partial differential equations, and (differential) geometry.
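The invariance property from the question about exercise 9 can be checked in one line. As a sketch, assume the X-ray transform in the redundant parametrization is $\xrt f(x,v)=\int_\R f(x+tv)\dd t$ (the normalization in the notes may differ). The change of variables $t'=t+s$ then gives, for every $s\in\R$,

$$\xrt f(x+sv,v)=\int_\R f(x+(s+t)v)\dd t=\int_\R f(x+t'v)\dd t'=\xrt f(x,v).$$

Any function of $(x,v)$ that fails this invariance therefore cannot be the X-ray transform of anything.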

Section 2

Why is the target space of the Fourier transform $\ft\colon L^2(\R/2\pi\Z)\to\ell^2(\Z)$ a space over the discrete set $\Z$ and not the whole $\R$ (i.e. $L^2(\R)$ instead of $\ell^2(\Z)$)? Does this have to do with the inverse problem or the physical setting? This has nothing to do with the specific problem; it is a mathematical property. A function cannot be $2\pi$-periodic unless all frequencies are integers. To make this statement more rigorous, one can show that the Fourier transform (in the sense of whole $\R^n$, not $\T^n$) of a periodic function is a distribution supported on the lattice. Another way to see this will come in section 11 when we discuss the Fourier transform in greater generality. The fact that only discrete frequencies are possible is not obvious at first. It is a key result in Fourier analysis that is seldom stated explicitly.
In the definition of $\ell^2(\Z)$ it would seem more logical to specify the inner product first instead of the norm. True. The inner product was left implicit on purpose; in simple cases like this one can figure out the intended inner product by staring at the norm. (Of course one can use the polarization identity, but it is unnecessarily clumsy here.)
Will later versions of the lecture notes include more of the content that was discussed in the lectures, or some specific references? Yes, there will be some more content in later versions. But the notes will focus on the key content, and lectures will also give additional remarks. I will also compile a list of references at the end of the notes.
Could Fourier series be defined also in the case of non-flat tori? What difficulties would one encounter in such a task? What role would a metric tensor play? There are two ways to go about this: Either forget about the metric and use the Fourier series on a flat torus regardless of the metric, or use the eigenbasis of the Laplace–Beltrami operator of the non-flat metric. Such a series is possible, but I am not aware of any uses in X-ray tomography.
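For a single exponential, the claim in the first answer that $2\pi$-periodicity forces integer frequencies can be verified directly. This is only a sanity check for pure frequencies $\xi\in\R$ (the symbol $\xi$ is introduced here for illustration); the general statement needs the distributional argument mentioned above.

$$e^{i\xi(x+2\pi)}=e^{i\xi x}\ \text{for all }x\in\R\iff e^{2\pi i\xi}=1\iff\xi\in\Z.$$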

Section 3

Are geodesics on the flat torus $\T^n$ all of the form $\gamma(t)=q(x+tv)$ for $x\in\R^n$ and $v\in\R^n\setminus0$? Yes, these are all the geodesics. Geodesics (as solutions of the geodesic equation, not as globally length-minimizing curves) are a local thing and the quotient map $q\colon\R^n\to\T^n$ is a local isometry. That is, if the flat metric on $\T^n$ is $g$, the Euclidean metric on $\R^n$ is $q^*g$. A local isometry always maps geodesics to geodesics.
Do the geodesics on the flat torus correspond to geodesics on the donut? No. Geodesics on the donut are different. With the usual identification there are common geodesics, but not all. The donut has less symmetry.
Why are we only interested in periodic geodesics? By exercise 21, a periodic geodesic is essentially a function $\R/\Z\to\T^n$. When you integrate over such a thing, you integrate over a compact set $\R/\Z$ (essentially $[0,1]$), and continuous functions are famously integrable over compact sets. If you take a non-periodic geodesic, you would need to integrate over all of $\R$. Since the geodesic can loop over and over on the support of the function $f\colon\T^n\to\C$, you typically end up with something non-integrable. Perhaps one could somehow redefine the integral to make sense of it, but it is tricky. A periodic geodesic is a closed subset of the torus, homeomorphic to $\T^1$. The closure of a non-periodic (= non-closed) geodesic is homeomorphic to $\T^k$ for some $1< k\leq n$. (Integration over such things would constitute a Radon transform on the torus.) The restriction to periodic geodesics is completely artificial from the point of view of our original X-ray tomography problem in the Euclidean space, but very convenient technically on the torus.
What is a curved torus like, compared to the flat torus? In a word: different. There is less symmetry. The Fourier series (no matter how defined, see questions to the previous section) does not play as well with the X-ray transform. There are areas of positive and negative curvature. The dynamics of the geodesic flow are different in different places.
Is a geodesic on a torus still such, even if it loops more than one time on itself? Yes. (It depends on what one means by a geodesic, but yes for us.) In our problem it doesn't make any difference. One can check that for example $\xrt_v=\xrt_{2v}$.
We have seen in exercise 27 that $\forall k\in\Z^n\exists w\in\Z^n\setminus0:k\cdot w=0$ works only for $n>1$, and this was said to be the reason why X-ray tomography does not work in 1D. I see that in this case the behaviour is different, but how does this result follow? All the rest of the auxiliary results in section 3 are valid for $n=1$. The only part of the proof that fails in 1D is in this exercise. Using the tools of the section, it's not hard to see that in 1D the X-ray transform of $f$ determines $\hat f(0)$ but nothing else. But this was to be expected: $\hat f(0)$ is the integral of $f$ over the only closed geodesic.
Are there any injectivity results on the geodesic X-ray transform that apply to non-flat tori? Or are there examples of non-injectivity (which you are aware of)? I am not aware of positive or negative results for a non-flat torus. An almost flat torus should be fine with an approximation argument, but I haven't seen it proven. Also the case of the donut should be tractable.
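The statement from exercise 27 discussed above can be made concrete with a short Python sketch. The function name `orthogonal_integer_vector` and the particular construction are choices made for this example and are not taken from the notes; any nonzero integer vector orthogonal to $k$ would do.

```python
from typing import Sequence, Tuple


def orthogonal_integer_vector(k: Sequence[int]) -> Tuple[int, ...]:
    """Return a nonzero integer vector w with k . w = 0, for k in Z^n with n >= 2."""
    n = len(k)
    if n < 2:
        # In 1D, k * w = 0 with k != 0 forces w = 0, so the construction breaks down.
        raise ValueError("need dimension n >= 2")
    w = [0] * n
    nonzero = [i for i, ki in enumerate(k) if ki != 0]
    if not nonzero:
        # k = 0: every vector is orthogonal to it, so pick the first basis vector.
        w[0] = 1
        return tuple(w)
    i = nonzero[0]            # an index with k[i] != 0
    j = 0 if i != 0 else 1    # any other index; this is where n >= 2 is used
    w[i] = -k[j]
    w[j] = k[i]               # nonzero, so w != 0
    return tuple(w)
```

The dot product is $k_i(-k_j)+k_jk_i=0$ by construction. The need for a second index $j\neq i$ is exactly the step that is unavailable when $n=1$.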

Section 4

It seems that we are assuming more data than we are using: we know the integrals over all lines, but we make no use of the lines through the origin in our inversion process. If the inversion only works with these lines, how does one practically exclude the unnecessary data? That is, how do you know which lines go through the origin? The methods of proof used in this course are not the actual methods of reconstruction in practical use. Some of our methods can be implemented, but there are still several practical considerations that we ignore completely. In theoretical results, it is quite common to throw away some data. It is sometimes convenient to look at some nice subset of data and forget the rest. This idea is not prominent in this course, but with Calderón's problem one often ends up using only so-called complex geometrical optics (CGO) solutions instead of all solutions to the relevant PDE. It would be possible to write the theorems assuming less data, but that makes the statements less clear.
How many lines are needed so that a function $f$ can be determined from its integrals over these lines within reasonable error? That would depend on the definition of reasonable error. No finite set of directions (or in particular, finite set of lines) is enough. I am not sure whether there is any quantitative analysis on how the error depends on the set of lines. Such things are usually most sensible when there is some a priori knowledge of the unknown functions. For example, in medical imaging we know pretty exactly what we are supposed to see, and we are looking for deviations from a healthy situation. Partial data issues will be discussed in section 12.
First lemma 4.2 seems arbitrary when the angle $0$ appears, but this $\xrt f_k(r,0)$ is clear when thought of as the "X-ray transform" of $a_k$ depending only on $r$. Exactly. It is essentially the X-ray transform of a 1D function. There is no new information at other angles since $\xrt f_k(r,\theta)=e^{ik\theta}\xrt f_k(r,0)$. The lemma could be generalized to all angles, but that would not help us. (Admittedly, it might help us understand something, but it will not help us prove anything.)
Should $\xrt$ and $\partial_\theta$ be symmetric in remark 4.4 for the discussion and if yes, should this be clear? The general phenomenon is that any (differential or other) operator commuting with rotations has this specific structure. The operator need not be symmetric. I will rewrite the remark more clearly. The X-ray transform is not symmetric (on the torus it can be seen as a family of symmetric operators), and that is why we will consider its normal operator. The derivative is skew-symmetric, which is just as good for the result. The case of symmetric matrices was supposed to be a simple example of the same phenomenon.
Does angular Fourier series behave nicely at infinity ($r\to\infty$)? Would everything work just fine if one considers the Euclidean space with the origin removed? (I guess some decay is then required.) As long as the function is in $L^2$ (or some such space) on almost every circle, one can write the Fourier series. For example, the same series makes sense for any continuous function $\R^2\to\C$. Fourier analysis is done circle by circle, so global behavior is irrelevant. If the function is in $L^2(\R^2)$, then the series converges in that space.
Could it be useful sometimes to replace circles in the definition of angular Fourier series by another family of simple closed disjoint curves that cover the punctured disc? E.g. to get similar decomposition on simple surfaces. It might be useful, but it will not be anywhere near as nice. For the tools used in sections 4–5, rotation symmetry is crucial, and a typical surface does not have that symmetry. If it does, then the method does indeed work, and simplicity is not needed. I am not aware of a way to use our argument with a general foliation.
Are there higher dimensional analogs of angular Fourier series? Are angular Fourier series somehow related to the spherical harmonics? Yes, indeed, the natural replacement is an expansion in spherical harmonics. One can define the spaces $H_k=\{p|_{S^{n-1}};p\colon\R^n\to\C\text{ is a homogeneous polynomial of degree }k,\Delta p=0\}\subset L^2(S^{n-1})$. It turns out that these spaces are orthogonal and $L^2(S^{n-1})=\bigoplus_{k\in\N}H_k$. In 2D $H_k$ is spanned by $e^{\pm ik\theta}$. The dimension is two apart from $\dim(H_0)=1$. The spaces $H_k$ are also the eigenspaces of the Laplacian on the sphere. In higher dimensions the dimension $\dim(H_k)$ grows as $k$ grows.
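The growth of $\dim(H_k)$ mentioned in the last answer can be tabulated with a short Python sketch. It uses the standard dimension count $\binom{n+k-1}{n-1}-\binom{n+k-3}{n-1}$ (all homogeneous polynomials of degree $k$ minus those of degree $k-2$), which is not stated in the answer above; the function name `harmonic_dimension` is chosen for this example.

```python
from math import comb


def harmonic_dimension(n: int, k: int) -> int:
    """Dimension of H_k in L^2(S^{n-1}): degree-k harmonic homogeneous polynomials in n variables."""
    if n < 2 or k < 0:
        raise ValueError("need n >= 2 and k >= 0")
    # All homogeneous polynomials of degree k in n variables.
    homogeneous = comb(n + k - 1, n - 1)
    # The Laplacian maps degree k onto degree k - 2; its kernel is what remains.
    lower = comb(n + k - 3, n - 1) if k >= 2 else 0
    return homogeneous - lower


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, [harmonic_dimension(n, k) for k in range(6)])
```

For $n=2$ this reproduces the dimensions $1,2,2,\dots$ stated above, and for $n=3$ it gives the familiar $2k+1$.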

Section 5

Could the flat torus have also a Helgason type support theorem? Or is there some simple reason why such result is impossible? No support theorem is known on the torus (or any other closed manifold). One problem is that avoiding an obstacle is hard. It seems that if an obstacle on $\T^2$ has interior, then for any given point there are only finitely many directions that lead to a periodic geodesic avoiding the obstacle. Depending on how one defines convexity, it can be argued that there are no convex sets on a torus apart from the whole space and the empty set. The reason why it is hard is that there are so few geodesics available. But that does not imply that the result is impossible. For all I know, it might even be true, but I have no idea how to prove it.
There were discussions on compactness and convexity of the set in the Helgason's support theorem. However, it was omitted what would happen to noncompactly supported functions. Is the decay requirement for Helgason's support theorem the same as in the injectivity of the ray transform of noncompactly supported functions? If I remember correctly, Helgason's support theorem requires more decay than injectivity. We will avoid decay conditions to keep things tidy, and therefore our functions are compactly supported.
Page 20, after formula (37): Why did we expect that we can pull out the factor $e^{ik\theta}$? Because of exercise 32 (rotation invariance of the X-ray transform)? It is because of exercise 34 (lemma 4.2). The $k$th Fourier component of the data should only depend on the $k$th Fourier component of the function $f$. The function $f_k(r,\theta)=a_k(r)e^{ik\theta}$ contains only the $k$th Fourier component, so we expect that $\xrt f_k(r,\theta)$ only contains the $k$th Fourier component. That is, we expect it to depend on $\theta$ only in the form $e^{ik\theta}$. The dependence on radius is more complicated, and that was the point of calculation of $\xrt f_k$ that led to the generalized Abel transform.
Page 21: We look at the generalized Abel transform. Without having any experience with Abel transform — does there also exist a proof of the injectivity of the X-ray transform using the Abel transform? I read that there is a connection between Abel and Radon transform, so maybe there exists something. Suppose the unknown function $f$ is radial, that is, $f(x)=a_0(\abs{x})$. Then the X-ray transform is $\xrt f(r,\theta)=\A_0a_0(r)$. Therefore to recover the function $a_0$ one needs to invert the Abel transform $\A_0$. In the case of radial functions the X-ray transform reduces to the Abel transform. We need different kinds of Abel transforms to treat other Fourier components. It might be possible to reduce the more general problem to the radial one, but I remember such an idea only vaguely.
How strongly does the invertibility of Abel transforms depend on the properties of the Chebyshev polynomials? Not at all, really. The specific structure we have allows us to have an explicit inversion formula. For the injectivity result, very little is needed.
Is there a general theory for Abel transforms? Yes, operators of this kind are "well-known" as the popular phrase goes. However, finding the exact results one needs for a specific purpose is not always easy. In inverse problems one wants injectivity, not just continuity or other such mapping properties. If you want to learn more, see this paper and references given there. You can replace the Chebyshev polynomials with almost anything.
What would be an example of a non-convex set where Helgason's support theorem works? For example the union of a compact convex set and finitely many points. Single points are "removable" for the X-ray transform, as one can approximate the missing lines by other ones. In 2D one can only have very little non-convexity. In higher dimensions there is more room. For example, for a banana $\subset\R^3$ one can apply the two-dimensional support theorem and get the desired result; it is enough that the slices of the obstacle are convex.
Why should $K$ be compact in the support theorem? The theorem is true for the open disc as well. (But not for a punctured disc or square!) Taking closures doesn't change the problem much. The important feature is boundedness. If the obstacle is, say, a closed half space, then the data is certainly insufficient.
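The reduction of the radial case to an Abel transform, mentioned in the answer about page 21, can be sketched as follows. Assume the line labelled by $(r,\theta)$ with $r>0$ has closest point at distance $r$ from the origin and is parametrized by arc length $t$, so that a point on it has norm $\sqrt{r^2+t^2}$. For a radial function $f(x)=a_0(\abs{x})$, the substitution $s=\sqrt{r^2+t^2}$ on $t>0$, with $\der t=s\,\der s/\sqrt{s^2-r^2}$, gives

$$\xrt f(r,\theta)=\int_\R a_0\big(\sqrt{r^2+t^2}\big)\dd t=2\int_r^\infty\frac{a_0(s)\,s}{\sqrt{s^2-r^2}}\dd s.$$

Under this normalization the right-hand side is what $\A_0a_0(r)$ would be; the exact form of the definition in the notes may differ.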

Section 6

Who found the inversion method for the X-ray transform using Riesz potentials (and when)? Do you know what was the original motivation of Riesz (or whoever began) to study such objects? The Riesz potentials are natural fractional differential operators, analogous to the fractional operators on the real line. $I_2$ is a solution operator to the Poisson equation, and the generalization to other $I_\alpha$ sounds like a reasonable thing to study. I don't know the original motivation for Riesz potentials, other than it was certainly not X-ray tomography.
Which of the inversion methods are the most commonly used in practical tomography? From the methods given in this course, the normal operator method is closest to practice. The actual practical algorithms are different in nature. Instead of an inversion formula, one typically uses a regularized fit to the data.
Which of the methods in this course result in reconstruction formulas and which do not? The only one that gives no reconstruction formula is the one given in sections 7–9 using the Pestov identity.
Page 24, after Exercise 53: Of course, self-adjoint operators are nice. But why do we really need a self-adjoint operator? The adjoint serves as a way to post-process our data. The original data $\xrt f$ might be inconvenient to use for one reason or another, so we operate on it with some operator $A$ to produce "refined data" $A\xrt f$. We just want to find an $A$ so that $A\xrt$ is nice. The meaning of niceness depends on context. Experience shows that the adjoint is often convenient. Self-adjointness is not always used in itself; it is just that self-adjoint operators tend to be nicer than most operators.
Page 26, first paragraph: How does the replacement of $S^{n-1}$ through its antipodal quotient work? And therefore, what is an antipodal quotient? One can define a relation $\sim$ on the sphere $S^{n-1}$ so that $v\sim v$ and $v\sim -v$ but there are no other relations. In other words, this is an equivalence relation where every point $v\in S^{n-1}$ is in relation with its antipodal point $-v$. The equivalence classes have size two. One might say that $S^{n-1}$ contains all oriented directions. Similarly, the quotient space $S^{n-1}/{\sim}$ contains all unoriented directions; we have identified the two orientations of every direction. This quotient can be called the antipodal quotient because it is the quotient corresponding to identifying antipodal points. The quotient is best known as the real projective space. There is a measure $S'$ on this space (corresponding to the measure on the sphere). We could replace $\int_{S^{n-1}}\cdots\dd S(v)$ with $\int_{S^{n-1}/{\sim}}\cdots\dd S'(v)$ in the definition of $\mu$. This might be more elegant, but it would be an unnecessary layer of structure. Moreover, the choice of not taking the antipodal quotient aims at making the sphere bundle business more understandable.
What does the set $\Gamma$ of all lines look like? Is there a better definition? What are the topology, measure, and manifold structure? The set is quite hard to visualize. (Good descriptions are welcome!) The measure was described in the lectures; describing a measure is essentially equivalent to telling how to integrate continuous functions. For details, look up the characterization of the dual of compactly supported continuous functions. The topology is the quotient topology inherited from $\R^n\times S^{n-1}$. I have no nice description for the smooth structure. There should be a separate course on the geometry of geodesics at some point.
Why is $\xrt^*$ called the formal adjoint? We defined the adjoint of a continuous linear operator $A\colon E\to F$. We want to use $L^2$ structure on both sides, but the X-ray transform is not continuous $\xrt\colon L^2(\R^n)\to L^2(\Gamma)$ as we saw in an exercise. We use the $L^2$ inner products, but we only used $C_c$ functions to test the desired property, not all $L^2$ functions. This is why it is not the adjoint in the sense defined in the course.
When we defined $\mu$ on $\Gamma$ in formula (50), why can't we use the same hyperplane for all $v$? There are two issues. First, if $v$ happens to lie in the hyperplane, then you can't reach all the lines in the direction $v$. Second, the measure is not rotation invariant and exercise 57 fails. If the hyperplane was given as $w^\perp$ for some fixed $w$, one would need to multiply the Hausdorff measure with $\abs{w\cdot v}$.

Section 7

Are there any nontrivial examples of functions on the sphere bundle that integrate to zero? By nontrivial, I mean that the function is not generated by the symmetric part of differentials of $n$-forms nor sums/series of such objects. It is actually fairly easy to see that a function $f\colon S\bar\Omega\to\R$ integrates to zero over all geodesics if and only if there is $h\colon S\bar\Omega\to\R$ so that $f=Xh$ and $h|_{\partial(S\Omega)}=0$. If, say, $f$ is continuous, then $h$ is continuous and also differentiable along the flow so that $Xh$ makes sense classically. To see this, pick $h=-u^f$. The whole problem in tensor tomography (on manifolds with boundary) is to show that the potential $h$ has the correct form; its existence is far simpler.
What can be said of the kernel of the X-ray transform for functions on the sphere bundle? The previous answer gives an almost perfect characterization of the kernel. The only issue is that regularity is problematic at the tangential part of $\partial(S\Omega)$.
What is the definition of the sphere bundle over a manifold? Suppose $(M,g)$ is a Riemannian manifold. The tangent bundle is $\{(x,v);x\in M,v\in T_xM\}$. The sphere bundle is $\{(x,v);x\in M,v\in T_xM,\abs{v}_g=1\}=\{(x,v)\in TM;\abs{v}_g=1\}$. The sphere bundle is a subbundle of the tangent bundle, meaning that it is a "subspace fiberwise". Just as $TM$ is not typically $M\times\R^n$, the sphere bundle is typically not $M\times S^{n-1}$.
What does it mean that unit speed parametrization makes fibers compact? We will work with continuous functions, and they will be integrable over compact sets. Therefore it is convenient that $S\bar\Omega$ is compact. A fiber bundle is compact if and only if the base space and the fibers are compact. The geodesic flow as defined in the notes will actually work for any set $\R^n\times A$ for $A\subset\R^n$. To make the various derivatives work nicely, it helps if $A$ is somehow good. We want to include all directions to include all geodesics. Unit speed parametrization feels somewhat natural. Compactness of the bundle requires that $A$ is compact. Therefore it's natural to take $A$ to be the unit sphere. In our geodesic flow the velocity is $v\in A$, so unit speed corresponds to $A=S^{n-1}$. Compactness would hold just the same for any compact set $A$. Differentiation with respect to angle becomes awkward if $A$ is not a sphere, but the radius of the sphere makes no difference. If all our geodesics have constant speed $7$, that is perfectly fine.
Is there a way to imagine the sphere bundle so that the connotations "horizontal" and "vertical" make intuitive sense? These names have to do with the canonical way of drawing a fiber bundle, the base being horizontal and the fibers vertical. It is convenient to have adjectives meaning "in the direction of the base" and "in the direction of a fiber" on a bundle.
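The easy direction of the characterization in the first answer of this section is just the fundamental theorem of calculus. As a sketch, assume that $X$ differentiates along the geodesic flow, so that $\frac{\der}{\der t}h(\gamma(t))=Xh(\gamma(t))$ for a lifted geodesic $\gamma(t)=(x+tv,v)$, and that $\gamma$ enters $S\bar\Omega$ at $t=0$ and exits at $t=\tau$ (the symbol $\tau$ is introduced here for illustration). If $f=Xh$ and $h|_{\partial(S\Omega)}=0$, then

$$\int_0^\tau f(\gamma(t))\dd t=\int_0^\tau\frac{\der}{\der t}h(\gamma(t))\dd t=h(\gamma(\tau))-h(\gamma(0))=0.$$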

Section 8

Is the Santaló's formula related to the Stokes theorem? Can we use the idea that "$g = dp$", where the $p$ is the inner integral of the right hand side (with suitable derivative $d$)? Of course, somehow it seems that the Santaló's formula is only a change of variables formula / Fubini's theorem. Santaló's formula is just a change of variables. It is a useful one at that. I would not regard it as a Stokes theorem (an integration by parts formula) since no derivatives are involved. One can use Santaló's formula to integrate by parts on the sphere bundle. The point of the formula is this: To integrate over the sphere bundle is the same as to integrate over every (lifted) geodesic separately and then to integrate over the space of all geodesics (now identified with $\partial_{in}(S\Omega)$).
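The change of variables can be checked numerically in the simplest case. The following is a minimal sketch, assuming that $\Omega$ is the unit disc in the plane with straight lines as geodesics and that the boundary measure carries the weight $\abs{v\cdot\nu}$; the function names and the midpoint-rule discretization are illustrative choices, not taken from the lecture notes.

```python
import math

def lhs_sphere_bundle(g, n=40):
    """Integrate g(x, v) over (unit disc) x (unit circle) with the midpoint rule."""
    total = 0.0
    dr, dphi, dth = 1.0 / n, 2 * math.pi / n, 2 * math.pi / n
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            phi = (j + 0.5) * dphi
            x = (r * math.cos(phi), r * math.sin(phi))
            for k in range(n):
                th = (k + 0.5) * dth
                v = (math.cos(th), math.sin(th))
                total += g(x, v) * r  # r is the polar Jacobian
    return total * dr * dphi * dth

def rhs_santalo(g, n=40):
    """Integrate g along each chord, then over inward-pointing boundary data
    with weight |v . nu| = cos(alpha)."""
    total = 0.0
    dbeta, dalpha = 2 * math.pi / n, math.pi / n
    for i in range(n):
        beta = (i + 0.5) * dbeta
        x0 = (math.cos(beta), math.sin(beta))  # boundary point
        for j in range(n):
            alpha = -math.pi / 2 + (j + 0.5) * dalpha  # angle from the inward normal
            th = beta + math.pi + alpha
            v = (math.cos(th), math.sin(th))
            tau = 2 * math.cos(alpha)  # length of the chord in the unit disc
            dt = tau / n
            line = 0.0
            for k in range(n):
                t = (k + 0.5) * dt
                x = (x0[0] + t * v[0], x0[1] + t * v[1])
                line += g(x, v)
            total += line * dt * math.cos(alpha)
    return total * dbeta * dalpha

# Example integrand depending on both position and direction.
g = lambda x, v: (x[0] * v[0] + x[1] * v[1]) ** 2
print(lhs_sphere_bundle(g), rhs_santalo(g))
```

For this integrand both sides can also be computed by hand and equal $\pi^2/2$, so the two printed numbers should agree up to discretization error.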
Page 40, 8.2: Why do we consider commutators? Is that something we will need in section 9 to complete this proof? And are they useful in general? To calculate with differential operators, we need a couple of basic tools. We need to be able to integrate by parts and change the order of differentiation. Integration by parts comes in the next section, and now we will study what happens when the order of differentiation changes. In our situation the order of differentiation does matter, but it only matters to a lower order, so to say. The effect of changing the order is captured by commutators. We will need lemma 8.1 in the proof, and proving such lemmas requires commutators.
Is there a standard procedure to reduce the commutator of every combination of $X,V,X_\perp$ of order $n$? I believe this would always reduce to a single product of order $n$, as in lemma 8.1. Am I right? That is not the case. The general rule is as with any differential operators: the commutator of operators of orders $m$ and $k$ tends to have order $m+k-1$. For example, $[X^2,V^2]=XX_\perp V+XVX_\perp+X_\perp VX+VX_\perp X$, which can be written in various different forms using the commutator formulas, but the order is inevitably three. In special cases the order can be lower. For a silly example, $[X^7,X^7]=0$. The second order operators $XV$ and $VX$ only differ in the first order, so their commutator has one order less than initially expected. The general method for simplifying commutators is straightforward: use the rules for commutators of products, and when you see two terms that contain the same operators but in different orders, commute the operators until the orders coincide.
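As a sketch of the method, the example above can be worked out using only the product rules $[A,BC]=[A,B]C+B[A,C]$ and $[AB,C]=A[B,C]+[A,C]B$ together with the convention $[X,V]=X_\perp$:

$$[X,V^2]=[X,V]V+V[X,V]=X_\perp V+VX_\perp,$$
$$[X^2,V^2]=X[X,V^2]+[X,V^2]X=XX_\perp V+XVX_\perp+X_\perp VX+VX_\perp X.$$

Every term on the right has order three, one less than the order four of $X^2V^2$ and $V^2X^2$.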
What is the Sasaki metric? Is it only useful on 2D manifolds? How to proceed in higher dimensions? The Sasaki metric is useful on manifolds of any dimension. In 2D it can be defined by a shortcut, by defining $X$ and $V$ and then letting $X_\perp=[X,V]$ and declaring the three vector fields to be orthogonal. In higher dimensions more effort is needed. For more details on the Sasaki metric, you can take a look at the notes from the inverse problems reading group, this page in particular.

Section 9

Page 47, 9.2: Obviously, our PDE is neither elliptic nor hyperbolic or parabolic. But are there more PDEs of the same form for which one could use our results? And do they have interesting applications? I am not aware of other similar PDEs. One might regard the equation as a sub-Riemannian wave equation, but I am not familiar enough with PDEs in such geometry.
The Pestov identity gives uniqueness. Does it give a formula for reconstruction? No, not directly. In principle it gives an algorithm: solve the PDE with the boundary data coming from the X-ray transform and then calculate $f$ from the solution $u^f$ of the PDE. But this leaves the implicit step of solving the PDE. There are reconstruction formulas based on the sphere bundle, but they are more involved than the ones we have seen on the course.
What are necessary conditions for a manifold to ensure that the X-ray transform is injective? This is not known, and trying to answer this question actually constitutes an active field of research. At least if the manifold is simple (compact manifold, strictly convex boundary, no conjugate points), then injectivity is known. There are numerous other results, too, and any one of the three assumptions can be relaxed in some cases. Presumably the geodesic X-ray transform is injective if the manifold is compact and non-trapping, but no proof is known.
Is the Pestov identity valid for less regular functions? Yes. The identity given in the lectures should be true for $u\in H^2_0(S\Omega)$. However, the regularity theory of the Pestov identity (especially concerning the regularity of the underlying geometry) has not been pushed far, and there is room for research.
How would the Pestov identity work if we do not assume that $u$ is compactly supported in $\Omega$? In fact, it is enough that $u$ vanishes at the boundary. Compact support was assumed merely for technical convenience. Otherwise one would need to prove boundary determination to improve the regularity of $u^f$.

Section 10

How is X-ray tomography generalized from vector fields to tensor fields of higher order? Are there uniqueness results? This is covered briefly in section 13.2 of version 3.
What else is the X-ray transform of a vector field called and why? It is also known as the Doppler transform. The physical example given in section 10.2 is the Doppler effect: motion of the medium affects the speed of sound.
What does the Hodge decomposition mean? The decomposition states that (under suitable assumptions) a differential form can be uniquely written as a sum of an exact form, a coexact form, and a harmonic form. In $\R^3$ (and with compact support) there are no harmonic forms. An exact form is a gradient and a coexact form is a curl, so this gives the familiar Helmholtz decomposition. There is not enough context for this decomposition on this course.
In the application of section 10.2, how can $c(x)$ depend on the position $x$? What is the physical meaning? Here are three possible causes for position-dependent speed of sound: (1) The material is solid (or very viscous) so that flow does not immediately destroy any structures originally present. (2) There is an external force such as gravity. For example, the speed of sound varies with depth in oceans and the atmosphere. (3) If the liquid is compressible, then compressed areas will typically have larger sound speed. In this case both pressure and speed of sound depend on time and position.
How is the fact that the fluid is incompressible going to help? It is surprisingly convenient, but this turns out to be precisely the requirement for uniqueness. I don't know a physical reason for why this should help, but I would be glad to learn.
If we assume that we know the solenoidal injectivity for all orders of tensors, is there a general procedure and statement similar to the Exercise 102 (where one studied data of a function + a vector field)? The same argument works for the sum of two tensor fields of orders $m$ and $m'$, provided that $m+m'$ is odd. Otherwise the result is false.
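The role of parity can be sketched as follows. The notation here is chosen for illustration: assume that the X-ray transform of a tensor field $h$ of order $m$ integrates $h(\dot\gamma,\dots,\dot\gamma)$ along a unit speed geodesic $\gamma\colon[0,T]\to M$, and let $\tilde\gamma(t)=\gamma(T-t)$ be the reversed geodesic. Reversal replaces $\dot\gamma$ by $-\dot\gamma$, so

$$Ih(\tilde\gamma)=(-1)^mIh(\gamma).$$

If $m$ is even and $m'$ is odd, the data for the sum $h+h'$ therefore separates into the two individual transforms:

$$Ih(\gamma)=\tfrac12\left[I(h+h')(\gamma)+I(h+h')(\tilde\gamma)\right],\qquad Ih'(\gamma)=\tfrac12\left[I(h+h')(\gamma)-I(h+h')(\tilde\gamma)\right].$$

Solenoidal injectivity can then be applied to each order separately. If $m$ and $m'$ have the same parity, no such separation is available. For instance, a scalar function $f$ and the second order tensor field $-fg$ (with $g$ the metric) cancel along every unit speed geodesic, since $f-fg(\dot\gamma,\dot\gamma)=0$.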

Section 11

On how general topological groups can one do Fourier theory? Are there any examples besides $\R^n$ and $\T^n$? Are they useful or just curiosities? In this course we have seen three locally compact abelian (LCA) groups: $\T^n$, $\Z^n$, and $\R^n$. A product of LCA groups is an LCA group, so the general theory also covers Fourier analysis on groups like $\Z^3\times\T^2\times\R$. Another sometimes useful example is given by finite abelian groups. A finite abelian group is a product of cyclic groups, and the Fourier transform on a finite cyclic group is known as the discrete Fourier transform. Wikipedia gives a number of details and applications. Fourier analysis on non-abelian finite groups or compact Lie groups is also quite useful, but it is more representation-theoretic than Fourier-analytic in nature. For more general (and less useful) examples, you can consider any abelian group with the discrete topology or an infinite-dimensional torus $\T^\kappa$ for any (index set or cardinal) $\kappa$.
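The finite cyclic case is small enough to write out in full. The following is a minimal sketch, assuming the counting measure as the Haar measure on $\Z/N$ and the normalization where the factor $1/N$ sits in the inverse transform; the function names are illustrative and the direct $O(N^2)$ sums are used instead of a fast algorithm.

```python
import cmath

def character(N, k):
    """The k-th character of Z/N: x -> exp(2*pi*i*k*x/N)."""
    return lambda x: cmath.exp(2j * cmath.pi * k * x / N)

def fourier_transform(f):
    """Discrete Fourier transform of f: Z/N -> C, given as a list of N values.
    The k-th coefficient pairs f against the conjugate of the k-th character."""
    N = len(f)
    return [sum(f[x] * character(N, k)(x).conjugate() for x in range(N))
            for k in range(N)]

def inverse_fourier_transform(fhat):
    """Recover f from its coefficients by summing over all characters."""
    N = len(fhat)
    return [sum(fhat[k] * character(N, k)(x) for k in range(N)) / N
            for x in range(N)]

f = [1, 2, 0, -1, 3]
fhat = fourier_transform(f)
recovered = inverse_fourier_transform(fhat)
print([round(z.real, 10) for z in recovered])
```

Each character is a homomorphism from $\Z/N$ to the unit circle, and the inversion formula is the finite analogue of the Fourier series on $\T^n$.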
Is accumulation of zeroes enough to guarantee that a real analytic function vanishes? At least in complex analysis the zero set of an analytic function is either discrete or the whole space. No. For example, the function $f\colon\R^2\to\R$, $f(x,y)=y$, is real analytic and doesn't vanish identically, but the zero set $\R\times\{0\}$ does have accumulation points. The problem is that the zeros only accumulate from one direction. In complex analysis, there is only one direction (up to multiplication by an invertible complex number), so there is no such issue. If the zero set of a real analytic function accumulates from all directions in a suitable sense, then it has to vanish.
What is the Haar measure? A Haar measure on a locally compact abelian group is a Radon measure (Borel, locally finite, and regular) on the group which is translation invariant. Every LCA group has a Haar measure and it is unique up to a multiplicative constant.
What came first, Fourier series, Fourier transform, or the general abstract theory we saw in the beginning of section 11? As always, the great and beautiful general abstract theory came last. Fourier series is simpler than the transform, and it came before. As usual, the theory was developed gradually.
In Wikipedia there is an article on the Paley–Wiener theorem: There is something called Schwartz' Paley–Wiener theorem which seems to be stated for holomorphic functions. Are the complex and real analytic versions related (can one e.g. prove one from the other one)? The real analytic version follows easily from the complex analytic one: the restriction of a holomorphic function on $\C^n$ to $\R^n$ is real analytic. In the other direction, a real analytic function can be extended to a holomorphic function at least within the radii of convergence (see exercise 110) at different points on $\R^n\subset\C^n$. Many versions of the Paley–Wiener theorem describe how the Fourier transform behaves when the imaginary part grows. Our version has no such control, and we specifically tried to avoid using complex analysis in several variables.

Section 12

What is the smallest requirement for $D$ or $D^\perp$ so that injectivity holds? It depends on the function space. For compactly supported functions, it is sufficient and necessary that $D$ is infinite. In the Schwartz space it is sufficient and necessary that $D^\perp$ is dense.
What does microlocal analysis mean? Microlocal analysis studies generalized differential operators (pseudodifferential operators and Fourier integral operators) and fine details of singularities. It is hard to give a more insightful answer anywhere near the framework of this course, as setting up and motivating microlocal analysis takes a while. This is currently being covered in the inverse problems reading group, and there are some good introductory materials.
Are fractional Laplace operators the only way to invert normal operators? Depends on the function spaces. If you make the target of the normal operator so small that it becomes surjective, then there is only one inverse. That one inverse happens to be a fractional Laplacian in the case of X-ray and Radon transforms. The inverse operator is free to do anything on the orthogonal complement of the image (or on the cokernel), so with some function spaces there are several inverse operators. However, one might argue that the inverses are essentially just fractional Laplacians anyway, because they will have to act exactly the same way on anything in the image of the normal operator.
Can one do a local reconstruction for the Radon transform in even dimensions? If not, why so? At least it is not possible to do local reconstruction from the normal operator $R^*R$ in even dimensions. The resulting Riesz potential has an essentially unique inverse (see previous question), and that inverse operator is non-local. However, this alone does not mean that one couldn't do local inversion for $R$ without the adjoint. I am not aware of a proof that local inversion is impossible, but I would like to see one.
Is there any easy way to see what is in kernel of local reconstruction operator? No. If there is a way, I would be happy to know.
How could one prove stability results under suitable conditions, and what are the main ideas? In a general framework (a general family of lines as opposed to a specific geometrical setting like imaging slice by slice) the only viable approach seems to be microlocal analysis. One needs to show that the data recovers all singularities. If one has injectivity on smooth functions (which may be proved with different methods; microlocal analysis is often weak at this), then recovery of singularities implies stability. However, there is no general classification of families of lines which give injectivity. Once one has injectivity (in the smooth setting), then stability follows more easily with the microlocal machinery.
Is not the data for the X-ray transform always partial, as we can only use finite amount of rays? Yes, in practice we only ever measure over a finite number of lines, and in theory that leaves a huge kernel. In practice this is not a huge problem. What full data means in a practical situation is that there are no geometrical restrictions: we can measure the integral over any line we want. Of course we are only granted a finite number of wishes, but those wishes are not constrained. In practice full data means lack of restrictions to sampling, not having data over every single line.
Are there some other interesting partial data problems for the X-ray transform which we did not cover in the lectures? For example ones that are encountered in applications or just theoretically difficult questions. Here is an example: Given a set $A\subset\R^n$, do the integrals over all lines that meet $A$ determine $f\colon\R^n\to\R$ uniquely? What if $f$ is supported in a set $B$? It is common that the measurement device is restricted to move within a limited set.

Section 13

Does integral geometry (cf. differential geometry) mean that one attempts to reconstruct a function from its integrals? Integral geometry can refer to different things, but in the context of inverse problems it means precisely that: trying to recover a function from its integrals. There are many variations on this problem, including non-linear ones.
Does integral geometry have anything to do with differential geometry, in some dual sense? Differential and integral geometry are not dual concepts like differentiation and integration are. Some integral geometry problems are posed in a differential geometric setting (example: integrals over geodesics on a manifold), but the similarity (or apparent duality) of the names is mostly coincidental.
What generalizations can one do in addition to geodesics in Riemannian geometry and do they have applications? Some problems concerning relativity or wave propagation give rise to integral geometry problems in Lorentzian geometry. Some problems in elasticity lead to integral geometry problems in Finsler geometry. Different kinds of reflections and refractions (and splittings) arise physically when waves meet interfaces. Sometimes the geodesics are "unphysical". For example, consider a material with different lattice orientations. Orientations can be described by the rotation group $SO(3)$. The distribution of different orientations is a function (or rather, a measure) on this group. Measuring the diffraction with a fixed initial direction and a fixed measurement direction gives the integral of this unknown function over a geodesic in $SO(3)$. The distribution can be recovered from such measurements because the X-ray transform is injective on $SO(3)$.
What are examples of problems where X-ray transforms pop up naturally? Actual X-ray tomography and linearized travel time tomography were discussed in the notes. In some cases Calderón's problem can be reduced to an X-ray transform problem. Geodesic X-ray tomography plays an important role in many spectral rigidity results (which tell that isospectral deformations are isometric).
What would be examples of interesting open problems in field of X-ray transforms? Here are some examples: If a bounded continuous function $f\colon[0,1]\times\R\to\R$ integrates to zero over every line through the strip, then is $f(x,y)=h(x)$ for some function $h$ with $\int_0^1h=0$? Is the X-ray transform injective on any compact non-trapping Riemannian manifold with strictly convex boundary? Which sets of lines are sufficient for injectivity of the X-ray transform?
Is it the same to say that all maximal geodesics have finite length and that $M$ is non-trapping? Yes. If a maximal geodesic has infinite length, then it is trapped within the manifold in at least one of the two directions. If all maximal geodesics are of finite length, then nothing is trapped inside.
What is an example of a non-simple manifold? A subset of a sphere which is strictly larger than a hemisphere.
It is possible to prove injectivity of the X-ray transform on the plane using a layer stripping method and local analysis near the boundary based on the Helgason's support theorem (we basically proved injectivity that way in the angular Fourier series method). Are there other ways to argue in the similar fashion without using directly the support theorem (or the same ideas we developed for it)? For example based on the other methods we used in this course. I'm not aware of methods like that which don't directly use a support theorem. Proofs using a foliation condition tend to rely on support theorems.