
Statistician Giles Hooker turns complex data into insights
​​
Talking to Prof. Giles Hooker is always a pleasure, and I had been looking forward to it ever since we scheduled his interview last December.
Born in Canada to Australian parents, Giles lives in Philadelphia (USA), where he conducts research and teaches at the Department of Statistics and Data Science at the University of Pennsylvania. Giles has a unique take on datasets no matter where they come from, and he asks questions that make people assess their work from a different angle. He is always supportive and ready to lend a helping hand, and it makes him very happy when people approach him before they acquire their datasets. :-) Giles also cares deeply about his students and makes sure they have a clear career path ahead of them.
In his free time, Giles enjoys cooking the most delicious meals, playing the viola and exploring the ocean through scuba diving with his wife Éilís.
I sat down with Giles to hear all about his interesting life and impressive career.
​
​​​
Please tell us about yourself and your roots. Where did you grow up?
​
I am complicated. My family is all Australian, but my parents moved to Canada before I was born. I spent the first few years of my life in Canada before returning to Australia. My father was an academic, so we went back to Canada every sabbatical. The notion that I have a hometown just does not seem to fit because I have many; my Canadian home is London, Ontario and my Australian home is Newcastle but I’ve also lived for long periods in San Francisco, Montreal, Ithaca and now Philadelphia.
​​
​​
When did you first get exposed to science and when did you decide to become a scientist?
​​​
I do not think I ever consciously made that decision. My dad was in philosophy, but he was originally in physics, and my rather hippy parents gave me chemistry sets and other sorts of science-based and educationally enriching playsets. They were a lot of fun, but I do not know how seriously I took them. I ended up in math, surprisingly; I was really interested in politics but also wanted to do sciences. I found an arrangement that would let me do that without doing a lab science, which I am terrible at. I ended up just by chance -- and in fact almost all of my life has been just by chance -- at the best place to study math in Australia and with the best mathematics students around. Then at some point during my PhD, I had to decide if I was going to stay in academia.
In terms of academia versus industry, I am there to support my students’ fulfillment, and industry needs well-trained researchers, too. A lot of academics feel that students going into industry is a waste of training. It is certainly the case that you get more out of students who go into academia in terms of papers and future collaborations, but thinking of your academic career as being there to build up a profile is the wrong way to approach being in science.
​​​
​
What did your educational path look like?
​
I started off being interested largely in political science but also wanted to do something in the sciences, partly out of snobbery, but also, I liked the precision of what you can say in science. I managed to do a combined degree at the Australian National University where I got a Bachelor of Arts and Bachelor of Science for only one extra year of study. During that time, I discovered what I really liked was math and I was very interested in numerical analysis, namely how do you do good approximation methods on computers. At that time, the faculty who worked in numerical analysis became interested in data mining.
I spent a lot of time saying that “I might be studying mathematics but at least I am not a statistician”. These faculty members gave a topics course on data mining, and I thought “This is what statistics should be!”. However, I gradually discovered that all the people whose work I thought was cool were actually in statistics departments. Then there was a Fulbright scholarship that was specifically for people doing statistics and I thought I should apply for that. I ended up getting that scholarship and went to Stanford, where Jerry Friedman, whose research I found particularly exciting, worked.
Then I did something completely different for my postdoc. I had been in touch with Jim Ramsay at McGill University, mostly because I really wanted to live in Montreal. He did super cool stuff too, so I went and spent two years with him, which was absolutely awesome. Then when I was on the job market for a faculty job, Cornell decided I was worth taking a risk on, and so I turned up in the US again and spent about 16 years on their faculty.
It also turned out that one of the people I would not normally have taken classes with, had I not started getting interested in data mining, was this guy called Peter Hall. I had no idea that he was a really big deal in statistics, and he wrote letters of recommendation for me. I am sure that is part of why I got into Stanford, and that was entirely by happenstance. It was not something I planned; I just lucked into it.
​​
​
Who was/is your biggest supporter?
​
I think the best answer to that is my students. There have been many points in my life at which particularly senior faculty have gone to bat for me, and I am enormously grateful. However, having people who trusted their futures to me, who stuck around to keep working with me even after I left their institution and who give me their honest opinion, is an enormously privileged position to be in.
​​​
​​
You are a Professor of Statistics and Data Science at the University of Pennsylvania. Please tell us a bit about your research foci: machine learning, functional data analysis, differential equations, computational statistics and statistical ecology?
​
When people ask me what I do, I say “too much” or that I have research ADD. I think I am bad at letting go of areas. For example, I keep on saying I do not want to do functional data analysis anymore, but people keep bringing me problems and I think “OK there’s something interesting to do there” – you might say I have trouble saying no.
I tend to work on complex models for systems. For example, you think about physical models that people build for epidemics or how species interact and compete. How do you describe those dynamics when often they are on time scales you cannot see particularly well? This is what applied mathematicians do all the time. Then I ask how we can use that data to inform the models or to work out what data we should collect or what we should do to make the model work better. Most recently I have been interested in work on biomechanics, which is a similar sort of process. I get high frequency measurements of somebody's movement, which is a fairly complex process, and it is different every time. How do I describe the variability? I want to describe the dynamics of that process, basically backing out to ask what are the modes of control processes that somebody is putting in. A lot of statistical research is on increasingly more complicated data; I tend to work with more complicated models.
I run a consulting class for our PhD program where we solicit projects from around the university. Sometimes, if I could have talked to these groups before they started collecting data, I might have designed things a bit differently. One of the great early statisticians said that if somebody brings you data after they have just done the experiment, you might be able to perform a postmortem statistical analysis.
There are times when case studies are important, and this is where the obsession with statistics is not always helpful. You must decide beforehand what your data is going to be, what you are going to collect and how you are going to run an experiment. How do you decide that without first getting some anecdotal evidence that gives you an idea about what you think is going on?
If you want to do a little bit of everything – the phrase is “play in everyone else’s backyard” -- like ecology or biology, the discipline that lets you do that is statistics. That comes with certain responsibilities: to ensure the statistics are honest, but also to learn about the subject area. I always comment that I work with people whose disciplines I know absolutely nothing about, and I do not get nearly enough pushback about saying completely stupid things.
​​​
​
What methods do you and your team apply to pursue your research foci?
​
My research is developing statistical methods and statistical tools. These usually arrive a couple of years after they would have been useful for the collaborators I had at the time. That is partly because it takes some time to get this work out, and partly because really novel statistical methods are actually a distraction: the community where they would be applied then gets lost in the method.
We have these complex models or problems that we need to produce solutions for, and my job is to come up with things that I think are interesting to do with that data and then to demonstrate that it gives you sensible results. A lot of what that means is either math or computing. We spend some time trying to prove that it will do useful things mathematically and also run large scale simulations where we are generating data where we know what we should find. Then we ask if the statistical guarantees that I am giving actually pan out in practice.
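[Editor's illustration, not Giles's own code: a minimal Python sketch of the kind of simulation check described above. It assumes a deliberately simple setting, Gaussian data with a known mean, and asks how often a nominal 95% confidence interval actually contains that known truth. The function name `coverage_of_mean_interval` and all parameter values are hypothetical choices made for this example.]

```python
import random
import statistics


def coverage_of_mean_interval(true_mean=2.0, true_sd=1.0, n=50,
                              n_sims=2000, z=1.96, seed=0):
    """Return the fraction of simulated datasets whose normal-approximation
    95% interval for the mean contains the known true mean."""
    rng = random.Random(seed)  # fixed seed so the check is reproducible
    hits = 0
    for _ in range(n_sims):
        # Generate data from a model where the right answer is known.
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        centre = statistics.fmean(sample)
        half_width = z * statistics.stdev(sample) / n ** 0.5
        # Did the interval capture the truth this time?
        if centre - half_width <= true_mean <= centre + half_width:
            hits += 1
    return hits / n_sims


coverage = coverage_of_mean_interval()
print(f"empirical coverage: {coverage:.3f} (nominal: 0.950)")
```

[If the method behaves as advertised, the empirical coverage should sit close to the nominal 95%; a large gap would suggest that the guarantee does not hold up in practice for this sample size or data-generating model.]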
​​​​
​
What advice would you give people who want to follow in your scientific footsteps?
​​​
Firstly, be sure that it is something you find exciting. I do not see any point in an academic career except insofar as you are having fun. The second thing is to learn more math. There is no such thing as knowing too much math. I usually told my undergraduate students at Cornell to do the hardest math that does not make you hate it. Everybody has a limit, but you will get more by doing more math; but don’t forget the fun bit, too.
It is amazing how many math teachers I think have math phobia. They are taught that this is the method for doing something, rather than to work creatively through it as a problem to be solved. If you just teach it as rigid rules without meaning behind them, then many people ask why they should care.
​​​​
​​​​​​​​
Who, what, when, where & why?
​​​​
​​​​​
Who?
- would you like to conduct research with if you had the chance?
Leo Breiman was the guy who produced random forests [an ensemble learning method for classification, regression and other tasks that works by creating a multitude of decision trees during training]. He was at Berkeley while I was doing my PhD at Stanford. I think he was transitioning to emeritus status at that point, and he was super smart and super creative. He was the one person counterbalancing three of the great researchers at Stanford whom I worked with, and he got to a lot of the same questions all on his own. I never actually got to interact with him.
​
What?
- do you like to do in your free time?
I cook a lot, but I also play the viola. I say that being a violist is a bit like being a statistician: everybody thinks they need you, but nobody can tell if you are good at your job.
​
When?
- do you find inspiration for your research?
Either in seminars or in meetings with collaborators. I am not terribly good at reading the literature. I absorb things best when people tell me about them and they have my attention. There is never enough time and way too many papers to read.
​
Where?
- is your favorite travel destination?
I try not to travel so much these days because I try to moderate (if you can call it that) my carbon footprint. I did a lot of traveling when I was in grad school. If I were going to go somewhere, there is a stunningly beautiful wilderness park in northern Ontario called Killarney where I used to go camping as a kid.
​​​
​
Why?
- did you choose your specific research topic(s)?
It is what comes across my desk when I have an idea on how to do something that no one else is trying. I tend to run away from competition. I do not try to do things better than other people do because, frankly, most of them are smarter than me, but I am good at saying, “Here is something that is interesting to me that people are currently not thinking about.”
You can be Carnegie and Mellon and build up a great big edifice that is solid and comprehensive, or you can be Lewis and Clark and go off into the wilderness, and I know that I am better at exploring.
​
​
How?
- do you deal with setbacks?
I try and learn from them because they usually provide some sort of information. The advice for students working with me is that it is always possible that Giles does not know what he is talking about! It is absolutely alright to say Giles is just wrong. When doing math, you get told you are wrong pretty quickly. It is hard to let go of your ego with how academia is currently practiced, and it can be fairly brutal to be told you are wrong, but I think people are happier if they accept that.
​
…or?
​
​
Attend a party or be the host?
Host, definitely.
​​
Museum or movie theatre?
Museum.
​
​
Sneakers or dress shoes?
Dress shoes, I think.
​
Optimist or pessimist?
Definitely pessimist.
​
Ambition or comfort?
Neither really. I want to work hard because I like things that I think are good things to do, not because they could be rewarding.
​
See the future or change the past?
I am not sure that I would want to do either.
​​​
​
The interview was conducted by Nicole Kilian and has been edited and condensed for clarity.
Image sources: Giles Hooker.
​
​​​
​​