by Max Frenzel: How much coffee is too much? A case study and tutorial on self-tracking to improve sleep…

A little while ago, I wrote about an experiment that I had done several years back. I had recorded every single caffeinated beverage that I’d consumed over the span of about a month. I looked at the results, and analyzed the patterns that I noticed.

However, I have to admit that I never really answered the titular question: how much coffee is too much?

I needed to dig deeper. In addition to analyzing my caffeine consumption in detail, as well as alcohol and exercise, and their effects, I wanted to give you some general guidelines for self-experimentation, as well as specific tools and the code to replicate my experiments and analysis.

Read on for those results.


Before we can answer how much caffeine is too much, we need to first clarify what we even mean by “too much”. We need to look at caffeine’s influence on other parameters of interest.

Sleep seemed like the most obvious metric to me. Besides, I have already been tracking my sleep for several years now with an Oura Ring. I bought my original one through their Kickstarter campaign, and recently upgraded to the second generation when it was released. So, the data was already readily available to me. (“Readily available” is important: when tracking your behaviors and habits, the more complicated the process, the less likely you are to stick to it.)

However, looking at two metrics like caffeine and sleep in isolation completely ignores the complexity and multi-faceted nature of the real world. In order to get more meaningful insights, I also wanted to track additional metrics that I assumed would show some correlations with sleep.

I settled on alcohol consumption and exercise because I felt that both have a very strong influence on sleep, and are also relatively straightforward to quantify and track.

These are simply my choices. If you decide to replicate this experiment yourself, I encourage you to choose the variables that you think matter to you. Maybe you smoke and want to see what effect that has. Or you are experimenting with different diets and feel like that has an impact on your sleep. Whatever makes most sense for you, that’s the right thing to track!


Tracking Data

There are countless tracking apps available, and I’ve tried plenty of them. Many come with analysis tools and tons of additional features and functions. They might be great for some people and some specific applications, but for me they’re usually trying to be too clever, trying to do too much.

What I need is a simple tool that allows me to log whatever I want, make my own trackers, and then be able to extract the raw data so that I can build my own analysis tools.

The perfect app I have found for this is the extremely simple no-nonsense rTracker. It allows completely customized trackers and export to CSV, without much else. Exactly what I needed.

Within rTracker, I created two custom trackers: one for caffeine and one for alcohol.

For the caffeine tracker, I added the caffeinated drinks I have most often as checkboxes, which let me enter them quickly, plus an additional field for noting exact caffeine amounts (in milligrams) from drinks that didn’t fall into one of these categories.

That way, it usually took me no more than 5 seconds to log any caffeine, and after a few days it became completely habitual. Even if I forgot to log a drink, rTracker allows for editing the time of the entry so I could easily add omitted drinks at a later time.

For tracking my alcohol consumption, I initially intended to follow a similar pattern: logging each drink. But given that it is much harder to quantify the amount of alcohol, and also the fact that I might get a bit inaccurate with the logging on particularly boozy nights, I decided to simplify by assigning four labels to general types of alcohol consumption.

“None” is self-explanatory; “A bit” refers to a glass of wine or a beer; “Quite a bit” is in the order of three to five beers; everything above that I defined as “A lot”. This is clearly not particularly precise, but it’s a decent tradeoff between ease of use (again, very important if you actually want to follow through with something) and precision.

I usually logged alcohol the following morning, when I also logged my first coffee of the day.

Beyond the analysis below, I experienced another benefit to tracking: often, the fact that you’re tracking something at all can have an impact that’s more important than the precise results or what you do with the data. Tim Ferriss and Kevin Rose discussed this point on a recent episode of The Random Show. The sheer habit of tracking alcohol makes you more conscious of your consumption. You don’t want to have another day where you have to enter “A lot” in your tracker.

For tracking all my sleep related metrics, I simply used the data I was already collecting with my Oura ring. If you are interested in sleep tracking, I really can’t recommend Oura highly enough. Ever since the first version, I’ve been extremely impressed by their results, and they also allow you to download all your data as CSVs through their Oura Cloud service—perfect for people who want to go beyond the basic (but already really good) analysis provided by Oura.

Every single day, the full Oura data comprises 54 individual metrics. That’s plenty of data to do some interesting exploration.

I also used Oura for my exercise/activity tracking. While most of this was tracked automatically, I didn’t wear it for my heaviest workouts, CrossFit, mainly because lifting heavy weights is not that comfortable with a ring, and I was scared to scratch it. Luckily, Oura allows you to add additional activities, specifying duration and intensity on a scale of “easy”, “moderate”, and “hard”, so I added those workouts myself.

CrossFit sessions vary dramatically in intensity, so to simplify I logged a 60-minute CrossFit session as 50 minutes of constant moderate exercise, unless it was a particularly easy or hard day, in which case I would adjust it accordingly. Again, this way of tracking is a compromise between precision and settling on something that’s simple enough so that I actually do it every time.

With these tools and methods, I tracked my behavior and consumption for a total of 91 days. Then, I exported all the data as CSV files and analyzed the results, heavily relying on the excellent Seaborn library for visualizations, in a Jupyter Notebook which you can find here.

For more details on how exactly I modeled my blood caffeine concentration from the caffeine consumption data, have a look at my previous article (or the code itself).
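
For readers who have not seen the earlier article, here is a minimal sketch of what such a model can look like. It is not necessarily the model used in the notebook: it assumes a one-compartment model with first-order absorption and elimination, an elimination half-life of 5 hours, and an absorption rate picked so that a single dose peaks roughly half an hour after consumption. All names, constants, and drink amounts are illustrative assumptions.

```python
import math

# Assumed constants, chosen for illustration only.
ELIMINATION_HALF_LIFE_H = 5.0  # hours until half the caffeine is cleared
ABSORPTION_RATE_PER_H = 8.0    # makes a single dose peak after roughly 30 minutes


def caffeine_from_dose(dose_mg, hours_since_dose):
    """Caffeine (mg) still in the body from a single drink."""
    if hours_since_dose <= 0:
        return 0.0
    ke = math.log(2) / ELIMINATION_HALF_LIFE_H  # elimination rate constant
    ka = ABSORPTION_RATE_PER_H                  # absorption rate constant
    scale = dose_mg * ka / (ka - ke)
    return scale * (math.exp(-ke * hours_since_dose) - math.exp(-ka * hours_since_dose))


def caffeine_in_body(drinks, at_hour):
    """Sum the contributions of all logged drinks, given as (hour, mg) pairs."""
    return sum(caffeine_from_dose(mg, at_hour - hour) for hour, mg in drinks)


# Hypothetical day: coffee at 9:00 and 10:30, green tea at 15:00.
drinks = [(9.0, 95.0), (10.5, 95.0), (15.0, 30.0)]
for hour in (9.5, 12.0, 18.0, 24.0):
    print(f"{hour:>4}h: {caffeine_in_body(drinks, hour):6.1f} mg")
```

Because each drink is treated independently and the contributions are simply added up, evaluating the function at a given bedtime yields a "caffeine at bedtime" value for that night.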


Notes on How I Treated the Data

Before diving into the results, a small side note on the analysis. I decided to look at the data in quite a simple way, treating each day as a completely independent data point.

While I was technically dealing with time series data, looking at points in isolation is much simpler and still gives some valuable (and more readily interpretable) insights.

Many standard methods and correlation measures for time series data are notoriously difficult to apply and often rely on the assumption of a “stationary” time series, one whose statistical properties (such as mean and variance) do not change over time, which in particular rules out seasonality. This is clearly not the case with this data, as we shall see below: weekdays and weekends, for example, exhibit clear seasonality.

To get truly accurate insights, I would have to take the time series nature into account, since it will certainly have an effect. Both heavy training and heavy drinking can have effects that span multiple days, and one night’s sleep will likely influence the following night’s sleep.
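
A quick way to gauge this kind of day-to-day dependence is the lag-1 autocorrelation, which compares each night's value with the next night's. The sketch below is illustrative only and is not part of the analysis that follows; the function name is hypothetical and the heart rate values are made up.

```python
from statistics import mean


def lag1_autocorrelation(values):
    """Correlation between each value and the one that follows it."""
    m = mean(values)
    numerator = sum((a - m) * (b - m) for a, b in zip(values, values[1:]))
    denominator = sum((v - m) ** 2 for v in values)
    return numerator / denominator


# Made-up average heart rates (bpm) for ten consecutive nights.
avg_heart_rate = [45, 46, 47, 52, 50, 47, 45, 44, 46, 45]
print(f"lag-1 autocorrelation: {lag1_autocorrelation(avg_heart_rate):.2f}")
```

A value close to zero would suggest that treating nights as independent loses little; a value far from zero would suggest that consecutive nights are related.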

In terms of establishing correlations between different variables, I largely relied on simple linear regression. This is also an extreme simplification, since many of the interactions are probably far from linear.

Still, the results I show here, which treat each day as a completely isolated data point and assume linear relations, should reveal some general trends, and they are the best tradeoff I found between simplicity and accuracy.

With this out of the way, let’s take a look at the actual results.


Looking at Basic Caffeine Data

To get some basic overview of the data, let’s first look at caffeine in isolation. As a simple first step, we can visualize my blood caffeine content over the entire duration I was tracking it. The plot below shows the absolute amount of caffeine I had in my body (at least, according to my simple model) over the entire duration of the experiment. Maybe a concentration such as “mg of caffeine per kg of bodyweight” would have been more meaningful, but since I’m only considering myself (and don’t think my body weight fluctuated all that much during the experiment), the two metrics are essentially equivalent.

All graphs by the author.

Besides the repeated spiking pattern with roughly one peak per day and maybe some interesting outlier days, this is not particularly insightful.

Zooming in on a span of five random days shows a bit more detail.

Spikes in caffeine roughly 30 minutes after consumption are now clearly visible. These tend to accumulate over the day and then decay off into the night until the pattern is repeated the next morning. Still, this in itself is not very informative.

We get a much more interesting plot if we average this data over all the days.

The red curve represents my caffeine concentration on an average day. I further separated this out into weekdays (blue curve) and weekends (green curve). The dashed lines show the peak caffeine concentration. The dotted lines (at the bottom left) show the caffeine concentration at my average bedtime (determined from my Oura data—more on that later).
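
The averaging itself is straightforward. Here is a minimal sketch, assuming the modeled caffeine level is available as (timestamp, mg) samples and that bucketing by hour of day is fine-grained enough. The function name and the sample values are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def average_daily_profile(samples):
    """Average caffeine per hour of day, overall and split by weekday/weekend."""
    buckets = {
        "all": defaultdict(list),
        "weekday": defaultdict(list),
        "weekend": defaultdict(list),
    }
    for timestamp, caffeine_mg in samples:
        # datetime.weekday() returns 5 for Saturday and 6 for Sunday.
        kind = "weekend" if timestamp.weekday() >= 5 else "weekday"
        buckets["all"][timestamp.hour].append(caffeine_mg)
        buckets[kind][timestamp.hour].append(caffeine_mg)
    return {
        name: {hour: mean(values) for hour, values in sorted(by_hour.items())}
        for name, by_hour in buckets.items()
    }


# Made-up samples: two weekdays and one Saturday.
samples = [
    (datetime(2024, 1, 1, 9), 100.0),
    (datetime(2024, 1, 2, 9), 200.0),
    (datetime(2024, 1, 6, 9), 50.0),
]
profile = average_daily_profile(samples)
peak_hour = max(profile["all"], key=profile["all"].get)
print(profile["weekday"], profile["weekend"], peak_hour)
```

The peak of each curve is the largest hourly average, and the level at the average bedtime can be read off the same per-hour table.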

One first thing I was quite happy to see is that I roughly halved (!) my peak caffeine levels since my previous experiment in 2015.

The difference between weekdays and weekends is also quite interesting. On the average weekday I tend to get up between 8am and 9am, make a coffee, and then sit down to read for 30 minutes to an hour before going to work. I often start my workday at a coffee shop, where I get my second coffee. This is reflected in the steep slope between 9am and 11am.

On weekends, except for a few outliers clearly visible in the average, I tend to sleep longer, thus also postponing my first coffee to a later time.

In general, I consume more caffeine on workdays than on weekends, which is probably quite a common pattern. However, being conscious that I will most likely want to sleep reasonably early, I try not to drink any caffeine past 4pm. On weekends, since I usually expect to go out and sleep later, I don’t mind drinking caffeine later in the day, leading to a later (albeit much lower) peak.


Comparing Caffeine and Sleep

We now understand some very general patterns about my caffeine consumption. It’s time to go beyond caffeine in isolation and bring sleep into the picture.

One metric that is a good indicator of sleep quality is the average resting heart rate during sleep. If that is unusually high, you are likely to have had an agitated and less efficient sleep.

To see the relation between these two variables, we can plot the caffeine concentration at bedtime against the average heart rate and do a simple linear regression.
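
For reference, a simple linear regression boils down to a few sums. The sketch below uses plain Python and made-up numbers; plotting libraries such as Seaborn can draw the same kind of fit (for example with `regplot`). The function name is hypothetical.

```python
from statistics import mean


def linear_fit(xs, ys):
    """Least-squares fit y = slope * x + intercept, plus Pearson's r."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


# Made-up nights: caffeine at bedtime (mg) and average heart rate (bpm).
caffeine_at_bedtime = [5, 12, 20, 31, 44, 58]
avg_heart_rate = [52, 47, 49, 46, 48, 45]
slope, intercept, r = linear_fit(caffeine_at_bedtime, avg_heart_rate)
print(f"slope = {slope:.3f} bpm per mg, intercept = {intercept:.1f} bpm, r = {r:.2f}")
```

The slope says how much the heart rate changes per additional milligram of caffeine, while r indicates how tightly the points follow the line.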

According to this plot, there does indeed seem to be a noticeable correlation between caffeine and heart rate. The plot seems to suggest that the more caffeine I drank, the lower my heart rate was!

Wait, what?!

That clearly goes against my expectations! Do we have some noteworthy medical discovery here? Something is fishy. Or rather, boozy.


Adding Alcohol to the Picture

As noted before, there are many complex interactions at play in this data, and ignoring them could lead us to come to some very rash conclusions.

A hint at what’s going on is actually buried in the average caffeine plot above. On weekends—days on which I’m more likely to drink larger amounts of alcohol—my bedtime tends to be later. A later bedtime means my body had more time to process the caffeine (besides potentially never having had that much caffeine in the first place). So, there is a correlation of higher alcohol and lower bedtime caffeine, which then also leads to the correlation (but certainly not causation) shown in the plots above.

Removing all points that correspond to days where I drank “Quite a bit” or “A lot”, the plot starts to look different.
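
The filtering step can be sketched as follows. The numbers are made up and deliberately exaggerated to show how a few heavy-drinking nights (low caffeine at bedtime, high heart rate) can create a negative correlation that disappears once they are removed. All names are hypothetical.

```python
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Made-up nights: (alcohol label, caffeine at bedtime in mg, average heart rate in bpm).
nights = [
    ("None", 10, 45), ("None", 20, 46), ("A bit", 30, 46), ("None", 40, 45),
    ("Quite a bit", 0, 60), ("A lot", 5, 62),
]

HEAVY = {"Quite a bit", "A lot"}
light_nights = [night for night in nights if night[0] not in HEAVY]

r_all = pearson_r([n[1] for n in nights], [n[2] for n in nights])
r_light = pearson_r([n[1] for n in light_nights], [n[2] for n in light_nights])
print(f"all nights: r = {r_all:.2f}; without heavy drinking: r = {r_light:.2f}")
```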

Now there seems to be almost no correlation between caffeine and heart rate—certainly not enough to be called statistically significant, or allow us to speculate that one might cause the other.

This simple example shows that alcohol clearly has a very strong influence on sleep. Let’s look into this a bit more deeply.

First of all, we can confirm what I have already alluded to above about the later sleep time and resulting lower caffeine concentration on boozy days.

In the “Bedtime Hour” plot, 0 corresponds to midnight, -1 to 11pm, 1 to 1am, and so on. We can see a very clear trend, especially towards the higher alcohol end. Whereas on “normal” days I tend to go to bed around midnight, the more I drink, the later I go to bed. This has, as expected, a very clear influence on caffeine levels at bedtime.
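
Expressing bedtime relative to midnight avoids the jump from 23 to 0 on the clock. A minimal sketch of such a conversion, assuming that any bedtime from noon onwards belongs to the evening before midnight (the function name is hypothetical):

```python
from datetime import datetime


def bedtime_hour(bedtime):
    """Hours relative to midnight: 23:00 -> -1.0, 00:00 -> 0.0, 01:30 -> 1.5."""
    hour = bedtime.hour + bedtime.minute / 60
    # Assumption: times from noon onwards count as "before midnight".
    return hour - 24 if hour >= 12 else hour


print(bedtime_hour(datetime(2024, 1, 1, 23, 0)))
print(bedtime_hour(datetime(2024, 1, 2, 1, 30)))
```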

Similarly, looking at the distributions of average and lowest heart rates across alcohol categories, another clear pattern becomes visible.

The difference becomes even more obvious when we look at it in category plots.

There is absolutely no doubt: the more alcohol I had, the higher both my average and my lowest heart rates were. I would have certainly expected to see a difference, but not of that magnitude. Whereas without any alcohol my average heart rate is around 45bpm and drops below 40 at its lowest, when I had a lot to drink it averages out to over 60 and rarely drops below 50bpm.
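
The comparison behind such category plots is a grouped average. A minimal sketch, assuming each night is a record with an alcohol label and the metrics of interest; the field names and the numbers are hypothetical and not taken from the actual export.

```python
from collections import defaultdict
from statistics import mean

# The four labels used in the alcohol tracker, in increasing order.
ALCOHOL_LEVELS = ["None", "A bit", "Quite a bit", "A lot"]


def mean_by_alcohol(nights, metric):
    """Average one metric per alcohol category, ordered from least to most."""
    grouped = defaultdict(list)
    for night in nights:
        grouped[night["alcohol"]].append(night[metric])
    return {level: mean(grouped[level]) for level in ALCOHOL_LEVELS if level in grouped}


# Made-up nights for illustration.
nights = [
    {"alcohol": "None", "avg_hr": 44, "lowest_hr": 39},
    {"alcohol": "None", "avg_hr": 46, "lowest_hr": 40},
    {"alcohol": "A bit", "avg_hr": 49, "lowest_hr": 43},
    {"alcohol": "A lot", "avg_hr": 61, "lowest_hr": 52},
]
print(mean_by_alcohol(nights, "avg_hr"))
print(mean_by_alcohol(nights, "lowest_hr"))
```

Keeping the labels in a fixed order matters for plotting, since alphabetical sorting would place “A lot” before “None”.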

This is (quite literally) a pretty sobering observation.

For this reason, I have omitted all days with “Quite a bit” or “A lot” of alcohol from the following analysis of caffeine and exercise. I have decided to keep “A bit” of alcohol included for two reasons. First of all, the difference, while noticeable, is not as large as for the other categories. Second, I actually expect some other hidden correlations to be at play here.

For example, I am more likely to treat myself to a glass of wine on a day when I also had a heavy workout. Similarly, more stressful days made it more likely that I’d grab one or two beers with friends, or a relaxing drink at home. Both of these factors might have an influence on heart rate, which could make the difference between no alcohol and a bit of alcohol appear more pronounced than it actually is.

However, before looking at the other metrics, let’s look a bit more at just how bad alcohol is for your sleep.

Heart rate variability (HRV) is very closely related to heart rate, so it’s another excellent measure for how restful a night was, and how well-recovered one is in general.

While we generally assign a single number to heart rate, the timing between consecutive beats is actually quite variable, precisely responding to our body’s needs in that particular moment. A high HRV is actually an indicator that our body is well-rested. The team at Oura wrote a nice introductory article about HRV, if you’re interested in the details.

Again, the now-familiar pattern emerges: more alcohol equals less recovery, by a significant margin.

Respiratory rate is another measure I could capture. As the plot below shows, breathing seems to speed up with more alcohol, although not quite as drastically.

A fairly obvious consequence of all this is that on days after I drank a lot, I was less active and rarely exercised.

As we can see, this manifests in a reduced calorie burn due to exercise. (More on “Activity Burn” below.)

Besides just looking at various metrics like heart rate during sleep, we can also look at the individual sleep phases themselves, and how much time I spent in light, deep, and REM sleep.

Interestingly, the effect on the relative times does not seem to be too strong. Light sleep appears to be largely unaffected, at roughly 50%. Deep sleep, unsurprisingly, seems to slightly decrease with alcohol. What is somewhat surprising is that REM sleep seems to increase with alcohol, apparently giving me more dreams after drinking.
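
The relative times are simple ratios of the per-stage durations. A minimal sketch, assuming durations in minutes; the function name and the example numbers are hypothetical and not taken from the Oura export.

```python
def sleep_phase_ratios(light_min, deep_min, rem_min):
    """Fraction of total sleep time spent in each sleep phase."""
    total = light_min + deep_min + rem_min
    if total <= 0:
        raise ValueError("total sleep time must be positive")
    return {
        "light": light_min / total,
        "deep": deep_min / total,
        "rem": rem_min / total,
    }


# Made-up night: 4 h light, 2 h deep, 2 h REM sleep.
print(sleep_phase_ratios(240, 120, 120))
```

Working with ratios rather than absolute minutes makes short and long nights comparable.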

If there is anything positive about consuming lots of alcohol, it’s how quickly I fall asleep.

While a little bit of alcohol appears to increase my sleep latency (the time it takes to fall asleep — more on this later) as well as the restlessness, a lot of alcohol actually seems to help here. I guess I basically just pass out straight away.

Finally, to sum everything up, we can simply look at Oura’s Readiness Score, a combination of many of the relevant factors resulting in a simple score between 0 and 100.

Overall, the verdict is pretty clear: While a little bit of alcohol doesn’t seem to do too much harm, more than a few drinks messes with our recovery and readiness the next day, and our long term health overall. This is not really a surprising finding, but seeing the hard quantitative facts definitely helps drive that point home.


The Effects of Exercise

Next, before diving into caffeine, let us take a quick look at exercise. As noted above, my way of tracking exercise was not necessarily the most accurate, but it should still give some useful insights. However, take the observations with a grain of salt.

As the measure of exercise, I decided to use Oura’s “Activity Burn” metric: the estimated calories burned through activity.

After exercise, our body needs more recovery. This largely happens during deep sleep. And this is noticeably reflected in my sleep ratios.

While light sleep (just like with alcohol) was fairly unaffected by the amount of exercise, hovering at around 50%, the proportion of deep sleep tends to increase with more activity, at the cost of REM sleep.

One expectation I had before starting this experiment was that heavy exercise, particularly in the evening (a factor which I did not track) would lead to more restless sleep and higher sleep latency.

However, this is not confirmed by the actual data. Both restlessness and latency seem to actually slightly improve with activity.

I have to mention here that I have some doubts about the accuracy of the latency values. While I don’t doubt any of the actual sleep values measured by Oura, the latency, i.e. the time from going to bed to falling asleep, generally seems to be an underestimate.

This is not surprising given the kinds of sensors the Oura ring is equipped with. Figuring out when I actually fell asleep just from accelerometer data and my pulse is a hard task. Particularly on very restless nights, I noticed that the latency was very underestimated—probably due to me tossing around a lot in bed, making the ring conclude that I’m still up.

Had I thought about this in advance, I could have manually tracked my bedtime and used that to calculate more accurate latencies. But unfortunately, I didn’t.
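
Such a manual calculation would be simple. A minimal sketch, assuming a manually logged time of getting into bed and a sleep onset time taken from the tracker (the function name and the timestamps are hypothetical):

```python
from datetime import datetime


def sleep_latency_minutes(in_bed_at, asleep_at):
    """Minutes between getting into bed and falling asleep."""
    return (asleep_at - in_bed_at).total_seconds() / 60


# Made-up night: in bed at 23:40, asleep at 00:05 the next day.
in_bed_at = datetime(2024, 1, 1, 23, 40)
asleep_at = datetime(2024, 1, 2, 0, 5)
print(sleep_latency_minutes(in_bed_at, asleep_at))
```

Using full timestamps instead of clock times keeps the subtraction correct across midnight.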

Falling asleep has always been a bit of a struggle for me, so I was particularly interested in the effects of the various factors on latency. I hope that despite these concerns, I can still gain some valid insights from the present data.

Nevertheless, I do trust the restlessness measure, and it seems to agree with the finding that latency decreases with more exercise.

Given my expectation of more restlessness with increased exercise, I also expected a higher heart rate. To my surprise, both the average and lowest heart rates seem to decrease with exercise (rather than increase).

This might also trace back to the larger fraction of time spent in deep sleep, and less in REM sleep, which often exhibits temporary spikes in heart rate.


Back to Caffeine: Beyond Heart Rate

Finally, let’s get back to caffeine. We’ve already seen its effects (or lack thereof) on the heart rate.

A slightly more noticeable — although still minor — effect can be seen on restlessness, latency, and sleep efficiency (time in bed actually spent asleep).

Overall, less caffeine appears to be better for these factors, but the effect is nowhere near as significant as I had feared. Certainly not in light of the strong influence of other factors, some of which I have talked about and others which I’m probably not even aware of myself.

Sleep latency does tend to go up with more caffeine overall, but this is largely due to a few outlier days, which might have any number of reasons besides caffeine.

I was quite happy to see this, and as a result have been enjoying my daily coffee habit (up to a certain extent) even more, and with less guilt than before.

Of course, there is also the risk that I’m just interpreting the results in this way because that’s kind of what I hoped to confirm, but let’s ignore that possibility for now 😉.

To conclude the caffeine analysis, let’s have another quick look at sleep phase ratios:

Once more, light sleep is essentially unaffected. Deep sleep, however, seems to decrease slightly, while REM sleep apparently increases. This could potentially be a result of when the different stages are occurring. Deep sleep tends to be more frequent at the beginning of the night (when caffeine is still high), and REM sleep is more frequent towards the end of the night (when more of the caffeine has been processed by my body).


Conclusions

If there is one conclusion that can be drawn from the above results with high certainty, it’s that alcohol is bad for your sleep. At least, it is for me. I encourage you to do your own experimenting, but I would be surprised if your results show something different.

All other results above, while surely hinting at actual patterns, are not really statistically significant enough to draw very strong conclusions. They are certainly not enough to suggest causality.

As already noted, it is generally extremely difficult to really identify causal relations with so many different and interacting factors involved. I did not take into account when my last meal was, how much blue light I was exposed to at night, if I took hot or ice baths (both of which I did on certain days), how stressed I was, if and when I meditated that day, or if I engaged in any other “pre-sleep-activities”. And those are just some obvious factors. There are probably many more hidden ones I cannot even think of.

Some of these other factors do have a particularly strong and identifiable influence, especially working late at night.

I recently finished another side project that combines music with artificial intelligence, and during the last two weeks of this project I was often working till late at night, more or less right before going to bed. Once in bed, my mind simply couldn’t quiet down. I was tossing and turning, with music playing in my head and new ideas constantly coming up. Even if I managed to get a decent amount of sleep, there were mornings on which I woke up completely exhausted because my sleep was almost entirely light sleep.

This shows that the quality of sleep is at least as important as quantity (and in most of the considerations above I didn’t even talk about overall quantity). An example of such a night was October 19. Below are my sleep stages for that night, taken directly from the Oura app.

I was already extremely tired from previous days, so I tried to get a solid amount of sleep, but despite my full 9 hours in bed (and most of it actually asleep), I was completely exhausted the next morning.

My REM sleep — largely responsible for mental recovery — was abysmally low, at only 5%. And immediately, the first thought upon waking was back to the project I was working on.

For comparison, here are my sleep stages on November 9, a more normal night without the stress of working till late.

Effects like stress and working late are much harder to quantify and track, but undoubtedly no less real.

One simple piece of advice to be drawn from this is to not work right before going to bed. Allow your mind time to settle down and switch off from problem-solving mode.

For me, reading fiction for at least half an hour before bed has been quite effective. In general, the more habits or triggers you can build to tell your body it’s time to wind down, the better.

For example, I also use an aroma diffuser with lavender oil before going to bed. While lavender is said to have relaxing properties, I’m not fully convinced by the evidence. However, I don’t doubt that it works as a trigger for my brain if every evening before sleep I smell lavender.

Finally returning to the original question I set out with: how much caffeine is too much? My analysis showed that for me, the effects are not as strong as I had feared. I can happily enjoy a few cups of coffee (or tea) per day without affecting my sleep too much.

Especially in light of the strong influence of other factors (particularly alcohol), I can conclude that if I want to improve my sleep quality, there are probably measures I can take that will have a much stronger effect than cutting down on caffeine.

In general, all the results seem to confirm that within certain healthy limits, none of the factors are too bad. Neither a little bit of caffeine nor a bit of alcohol leads to overly detrimental results. It only gets problematic when we exceed these healthy limits.

And with this, I’ll now treat myself to a nice cup of coffee — only my second one of the day, well within the limits that are good for me.

Source: Better Humans