[Edit January 30, 2019:
After receiving some feedback about this post, I would like to address a few things that I think were not clear enough. I’d like to thank Scott Rao for his comments.
First, when I use language such as “over-extraction”, and “under-extraction”, I don’t mean that the associated extraction numbers are necessarily undesirable. What I really mean is “more extracted” and “less extracted” – the actual level of extraction that is desirable depends on several things, one of which is the subjective sensory factor. Another is the narrowness of the particle size distribution generated by a grinder, as I mentioned in the post. So, it would be wrong to say that an “optimal extraction is at 21%” for example; the exact number that someone finds optimal will depend on preference, roast development and quality, and evenness of extraction. The extraction yield numbers that I give in the text are just examples that I threw around, please don’t take them as absolutes.
It also came to my attention that the evidence for fast-extracting compounds being more on the “vegetal and sour” side of taste is speculative, so please take this claim with a big grain of salt. Instead, it would be more careful to say that low extraction yields will generally produce a less balanced overall taste, because only some fraction of all available chemical compounds get extracted. Think of it like listening to music with a very aggressive equalizer turned on. The evidence seems to be stronger on the other side of average extraction yields, in the sense that bitterness and astringency are part of the slow-extracting compounds, and they tend to take a lot of space in the perceived taste profile of a cup.
I did not do a detailed consideration of the process of erosion in this post, but it still plays a role even in filter coffee. It is simpler to model because the fines just immediately extract completely in contact with water, so I did not include erosion in this discussion without some data to play with. The amount of fines present in a particle distribution will definitely have a strong effect on the flavor profile of the cup, on top of the size of particles – I will talk more about it in the near future !
Finally, please do take this whole model with a grain of salt – it has not yet been tested against real data, I assumed spherical particles, and I based all of it on the assumption that chemical compounds extract at a rate that decreases exponentially. My hope is that it will be useful to understand some aspects of extraction dynamics, but it is in no way a perfect model.]
Coffee extraction is a subject I’ve touched on a few times on this blog. Today I want to have a more profound discussion on this subject, because I recently realized I had a very simplified view of what’s happening during coffee extraction. I’ll go over the basic principles first, and then go gradually deeper and deeper into this rabbit hole. This is one of those times where I will be posting some equations, but I will try to translate them into words and figures as we go along, so please don’t feel bad if you don’t know anything about maths. I hope to be able to describe them well enough that you won’t need to have a degree in maths or physics to follow the big picture. The value of equations is that they allow me to see what arises from just a few fundamental suppositions.
Specialty coffee brewers often talk about total dissolved solids (TDS) and average extraction yield (EY) when they describe a method or a coffee they brewed. As I briefly described earlier on this blog, the first concept of TDS really describes the concentration of your beverage: espresso typically has 7% to 12% TDS, and filter coffee typically has 1.3% to 1.45% TDS. The second concept of average extraction yield describes what fraction of the coffee beans were dissolved in your beverage. This number is typically between 19 and 23%, and can never go above ~ 30% because the remaining 70% of the coffee beans is just not dissolvable in water.
At first glance, knowing the average extraction yield might seem to be just another, more convoluted way of describing the concentration of your coffee. But it’s not ! Average extraction yield was found to correlate very well with the taste profile of a brew. If you make three brews with the same coffee, and reach 18%, 22% and 27% average extraction yields, then add the appropriate amount of water such that they all have the same concentration (e.g. 1.3% TDS), the three cups will taste very different. The first one will tend to be more vegetal and sour, the second one will be more well-balanced, complex and enjoyable, and the third one will be more bitter and astringent.
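The link between the two numbers can be sketched in a few lines of Python. The formula used here (dissolved mass = beverage weight × TDS, divided by the dry dose) is the usual definition of average extraction yield. The 15 g dose, the 200 g beverage weight and the function names are made-up values for illustration only.

```python
def extraction_yield(tds_percent, beverage_g, dose_g):
    """Average extraction yield (%): dissolved mass divided by the dry coffee dose."""
    dissolved_g = beverage_g * tds_percent / 100.0
    return 100.0 * dissolved_g / dose_g


def water_to_add(tds_percent, beverage_g, target_tds_percent):
    """Grams of water to add so that the beverage is diluted to the target TDS."""
    dissolved_g = beverage_g * tds_percent / 100.0
    final_beverage_g = dissolved_g / (target_tds_percent / 100.0)
    return final_beverage_g - beverage_g


dose_g = 15.0       # hypothetical dose
beverage_g = 200.0  # hypothetical beverage weight
for tds in (1.35, 1.65, 2.025):  # chosen so the yields land on 18%, 22% and 27%
    ey = extraction_yield(tds, beverage_g, dose_g)
    extra_g = water_to_add(tds, beverage_g, 1.3)
    print(f"TDS {tds:.3f}% -> EY {ey:.1f}%, add {extra_g:.1f} g of water to reach 1.3% TDS")
```

After dilution the three cups share the same concentration, but each one still carries the mix of compounds that came with its own extraction yield.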
Why does average extraction yield correlate so well with flavor profile ? Ultimately, this is due to different chemical compounds extracting at different rates. Some of the compounds that we typically don’t like to taste are very slow to extract (thankfully !), so they will start to become apparent only when you reach high extraction yields. Other components that extract very fast are enjoyable, but if they’re not balanced with other stuff they produce a less interesting cup. In other words, our goal is to extract as much of the good stuff (the compounds that extract at average and fast speeds) as we can, while avoiding the nasty stuff (the compounds that extract at slow speeds).
The concept of an average extraction yield is useful, but it’s not at all the ultimate descriptor of a coffee cup’s flavor profile. Imagine a situation where some of your coffee grounds extract faster than others – the resulting coffee cup might be composed of some grounds extracted at 18%, and others extracted at 28%, and you could still get an average extraction yield around 23% in the cup. If you were to compare this with a cup where all coffee grounds extracted at 23% exactly, you would most likely find the second cup more enjoyable (this is not the one they sell at Second Cup). Basically, the second cup has extracted a lot of the “good stuff”, and very little of the bitter, astringent taste. The first cup however has a lot of coffee grounds that reached a 28% extraction yield, so they will be contributing some of the less desirable taste in the cup.
One practical result that arises from this is that lower quality equipment or brew methods that produce a wider range of extraction yields will only allow you to reach average extraction yields around 20-21%. If you go any higher than this, then you will start getting too much of the bitter taste. If you manage to produce a brew where the extraction of individual coffee particles is much more uniform, then you will be able to reach higher average extraction yields, about 22-24%, without getting too much of the bad stuff.
One thing that can explain why your coffee particles may not all extract at the same rate is the fact that they may have different sizes. As Scott Rao explains nicely in this blog post, there are two completely different physical processes by which coffee extracts: erosion and diffusion. Erosion happens when a coffee cell is broken and water can very easily wash away all of the dissolvable compounds that it contains. As coffee cells are very small (around 20 microns), this happens only at the surface of coffee particles, where some broken cells are exposed, or in coffee particles so small that all coffee cells are broken up. In this scenario, water very quickly dissolves the full ~30% of the coffee that can be dissolved. As you may have guessed, erosion is the dominant process in espresso or Turkish brews, because those use very fine grind sizes.
Diffusion is the process that dominates in filter brews. In this scenario, water has to enter the tiny pores of the coffee cell walls, dissolve the flavors, and come back through the tunnels. As you might expect, diffusion is much slower than erosion. In this post I will focus more on diffusion, because filter brews are my bigger focus at the moment.
Now comes the part I did not understand very well until very recently. One thing I mentioned earlier on this blog was that smaller coffee particles extract faster than the larger particles. This was actually kind of true, but my reasoning was not. I was really confusing the extraction of a single coffee particle with that of a population of coffee particles. If you have a collection of very coarse coffee particles, they will collectively extract much slower than a collection of very fine coffee particles, because the finer particles are presenting much more total surface for the same total mass of coffee.
If you look at a single coarse particle and a single fine particle however, and measure how fast they provide flavor compounds, then the picture is quite different. The single fine particle is much lighter, and has a much smaller total surface than the single coarse particle, so it is actually the coarse particle that would win the race to higher concentrations. Assuming the fine particle is large enough that we are still within the regime of diffusion, each cell at the surface of the fine coffee particle is extracting at the exact same speed as each cell at the surface of the coarse particle.
The last paragraph is really key to understanding why I have been thinking a lot about this lately. It’s worth reading it again and making sure you understand it well. Once you do, something might become clear to you: a population of finer grinds will reach higher beverage concentrations faster, because you have a large number of particles and, thanks to their larger total surface area, they collectively provide coffee compounds faster than a collection of coarse particles. Our picture of how TDS depends on grind size is quite clear.
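The difference between a single particle and a population can be checked with simple sphere geometry. This is only a sketch: the particles are assumed to be perfect spheres of equal density, and the two radii are arbitrary illustrative values.

```python
import math


def sphere_surface(radius_um):
    """Surface area of one spherical particle (square microns)."""
    return 4.0 * math.pi * radius_um ** 2


def sphere_volume(radius_um):
    """Volume of one spherical particle (cubic microns); mass scales with it."""
    return 4.0 / 3.0 * math.pi * radius_um ** 3


def total_surface_for_same_mass(radius_um, total_volume_um3):
    """Summed surface of identical spheres that together fill total_volume_um3."""
    n_particles = total_volume_um3 / sphere_volume(radius_um)
    return n_particles * sphere_surface(radius_um)


coarse_um, fine_um = 500.0, 100.0              # hypothetical radii
total_volume = 1000 * sphere_volume(coarse_um)  # same coffee mass for both grinds

single_ratio = sphere_surface(fine_um) / sphere_surface(coarse_um)
population_ratio = (total_surface_for_same_mass(fine_um, total_volume)
                    / total_surface_for_same_mass(coarse_um, total_volume))
print(f"one fine particle has {single_ratio:.2f}x the surface of one coarse particle")
print(f"the fine population has {population_ratio:.1f}x the total surface")
```

For a fixed mass the total surface scales as 1/R, which is why the fine population wins even though each individual fine particle exposes less surface than a coarse one.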
BUT, once you accept that each coffee cell at the surface of each coffee particle extracts the same way and at the same speed regardless of the particle size, then it becomes entirely mysterious why different grind sizes or different particle distributions would produce different uniformities of extraction yield, and different taste profiles ! If this was the whole picture, then the only thing we would ever care about would be the beverage concentration (in % TDS), and all coffee cells would always be providing us with the same flavor profiles whether they are attached to a large or a small coffee particle.
I think the key to understanding the link between the distribution of particle size and the distribution of extraction yield is something else: deeper layers of coffee cells extract slower than surface layers. Imagine you had only two layers of coffee cells that can be reached by water, and the deeper layer extracts much slower than the surface layer. Now imagine you have two spherical coffee particles, one that is just as large as two layers of coffee cells, and one that contains thousands of layers of cells. Let’s draw this:
It might become obvious from this drawing that the amount of second-layer cells is much smaller than the amount of surface cells in the small coffee particle. In the case of the very large particle, they’re almost equal ! This immediately provides a way to understand how different-sized coffee particles are providing different flavor profiles. The small particle will be producing a more uniform extraction yield, because it is composed of one surface layer extracting uniformly, plus a small contribution of a deeper layer that extracts slowly. The combination will be a little bit non-uniform. It will be skewed slightly on the low extraction side, because of the small contribution of these second-layer coffee cells. The larger coffee particle will produce a much less uniform extraction, because the contribution from the slowly extracting second-layer cells is as big as that of the surface cells. Once again, I think this might be easier to understand with a figure:
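The layer counts behind this argument can be estimated by treating each layer as a spherical shell one cell thick. This is a rough sketch: the 20 micron cell size is the value quoted earlier, while the shell approximation and the two radii (2 cells and 1000 cells) are illustrative assumptions.

```python
import math

CELL_UM = 20.0  # typical coffee cell size quoted earlier in the post


def cells_in_layer(radius_um, k, cell_um=CELL_UM):
    """Approximate number of cells in layer k (k = 0 is the surface) of a sphere."""
    outer = max(radius_um - k * cell_um, 0.0)
    inner = max(radius_um - (k + 1) * cell_um, 0.0)
    shell_volume = 4.0 / 3.0 * math.pi * (outer ** 3 - inner ** 3)
    return shell_volume / cell_um ** 3


def second_to_surface_ratio(radius_um):
    """Number of second-layer cells divided by the number of surface cells."""
    return cells_in_layer(radius_um, 1) / cells_in_layer(radius_um, 0)


small_ratio = second_to_surface_ratio(2 * CELL_UM)     # particle two layers deep
large_ratio = second_to_surface_ratio(1000 * CELL_UM)  # particle with thousands of layers
print(f"small particle: {small_ratio:.3f} second-layer cells per surface cell")
print(f"large particle: {large_ratio:.3f} second-layer cells per surface cell")
```

In the small particle the slow second layer is a minor contribution, while in the large particle it weighs almost as much as the surface layer.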
In real life, water is able to reach a bit deeper than two layers of coffee cells. In one of my earlier posts, I discussed a recent experiment carried out by Barista Hustle, which demonstrated that water can reach down to approximately the 5th layer of coffee cells on average. If you haven’t watched their video, it’s worth it – this is what made me realize that I was misunderstanding the details of extraction.
So now, we saw that each size of coffee particle produces a distinct profile of extraction yield, and therefore a distinct flavor profile. We also saw that coarser particles inevitably produce less uniform extractions. You can now see why using a grinder that produces a very wide distribution of coffee grind sizes might be a problem: you are mixing up lots of different flavor profiles. However, this new way of thinking about extraction might also have you realize that a perfectly uniform particle distribution will not produce a perfectly uniform distribution of extraction yield !
Instead, such a perfectly uniform particle distribution would just produce exactly the same extraction yield distribution as a single coffee particle would – and it is not uniform. Still, the final extraction yield distribution will be tighter if your particle size distribution is also tighter, which is desirable. It still came as a shock to me that even with a light years-wide roller mill grinder, you will not obtain a perfectly uniform distribution of extraction yields, unless you also use coffee particles that all contain exactly 2 x 2 x 2 intact coffee cells (OK, maybe you can do this if you have such a large grinder).
There’s also another consideration about grind size which I did not touch in this discussion: coffee waste. The coarser you grind, the larger will be the total mass of coffee that is inaccessible to water. This means that, in addition to changing the taste profile, grinding coarser is in some way similar to also using a smaller coffee dose. I won’t discuss this more in this post, but it’s worth remembering it.
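A crude estimate of this wasted mass treats the particle as a sphere with an unreachable core. This sketch assumes a hard cutoff at 100 microns (five layers of 20 micron cells), which is a simplification of the "approximately the 5th layer on average" result mentioned above; the radii are illustrative.

```python
def inaccessible_fraction(radius_um, reach_um=100.0):
    """Fraction of a spherical particle's mass lying deeper than water can reach."""
    if radius_um <= reach_um:
        return 0.0  # water reaches the centre of the particle
    return ((radius_um - reach_um) / radius_um) ** 3


for radius_um in (100.0, 200.0, 400.0, 800.0):  # hypothetical particle radii
    wasted = 100.0 * inaccessible_fraction(radius_um)
    print(f"radius {radius_um:5.0f} um: about {wasted:4.1f}% of the mass is out of reach")
```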
Now let’s do some maths
Now that I’ve tried to lay out the concepts with hand-waving explanations and drawings, I’d like to attempt formalizing it with equations. Those not too versed or interested in maths may find the rest of this post anywhere between boring and insufferable. I find it really interesting to be able to write down equations to describe a system and see where it leads me. Often, this is a way to realize some consequences that you may not have foreseen, and I think some of you will find value in the figures below (or even in the equations).
The first assumption I will base this formalism on is that each of the chemical compounds in a coffee cell gets extracted at an exponentially decreasing rate:
In this equation, m_i is the amount of mass extracted from a chemical compound that we would call “compound number i“, t is the amount of time since the beginning of the extraction, and τ_i is the characteristic time needed to extract the compound: it’s larger for the more slowly extracting compounds. The left side of the equation is a time derivative, which means that it describes the rate of mass extraction per unit of time. This might seem like I pulled this equation out of nowhere, but it’s something that arises quite often in these kinds of problems: there are initially a lot of different ways for water to come in contact with large amounts of the soluble compound, and the less of it remains in the coffee cell, the slower the extraction rate becomes. I’m not convinced this is the ultimate way to characterize this problem, but I think it’s at least a good one.
This equation tells us about the rate of increase of the compound, but what we really want to know is the amount of extract that ends up dissolved in water as a function of time. To obtain this, we need to solve the equation above (which I won’t do in detail here). The solution is:
where a new constant M_i was introduced, representing the total mass of this particular compound inside the coffee cell. At this point, it would be worth visualizing what this equation looks like:
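The single-compound behaviour can be tabulated with a short Python sketch. It assumes the saturating exponential form m(t) = M (1 - exp(-t/τ)), which matches the description above (an exponentially decreasing rate and a total mass M); the exact notation of the original equations is not reproduced here, and the function names are illustrative.

```python
import math


def extracted_mass(t, total_mass, tau):
    """Mass of one compound extracted after time t, assuming m = M (1 - exp(-t / tau))."""
    return total_mass * (1.0 - math.exp(-t / tau))


def extraction_rate(t, total_mass, tau):
    """Time derivative of extracted_mass: an exponentially decreasing rate."""
    return (total_mass / tau) * math.exp(-t / tau)


tau = 1.0         # characteristic time, arbitrary units
total_mass = 1.0  # total mass of the compound in the cell, arbitrary units
for n in range(0, 6):
    t = n * tau
    print(f"t = {n} tau: extracted {100 * extracted_mass(t, total_mass, tau):5.1f}%, "
          f"rate {extraction_rate(t, total_mass, tau):.3f}")
```

With this form, about 63% of the compound is out after one characteristic time and about 95% after three.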
Now, given that each chemical compound extracts at its own speed, obtaining the total mass of everything extracted requires you to take the sum of the equation above, for all available compounds:
Now, what does the sum of lots of different extraction equations like those look like ? It’s really hard to tell if you make no assumption at all about the collective properties of the characteristic extraction times τ_i. One way to get around that is to do it numerically; another is to ask what the result looks like if all the extraction times τ_i are close to one another, and thus close to an average extraction time τ. A mathematical way to express this is:
Here, τ is the average characteristic extraction time, and ε_i is just a symbol I decided to use to express the small deviations around the average, for each compound. It may seem weird that I defined this equation with respect to the inverse of the extraction times, but it will make the subsequent maths easier. Now, I need to make the approximation that the deviations are very small with respect to the average:
And this will allow me to simplify the equation for the total extracted mass as a function of time, with a neat trick that physicists love, called a Taylor expansion around ε_i = 0:
The technical term for what I just did there, besides annoying most of my readers, is a first-order approximation (literally, not just figuratively). You might notice that the first term on the right part of the equation is very similar to the equation we had for a single species, with τ_i replaced by the average τ. This is very neat, because it tells us that this very similar equation is a zeroth-order approximation of the real solution. This means that it captures the largest portion of the answer, as long as the ε_i factors are small like we first assumed.
The second term, which looks a bit more complex and still has this big Σ symbol that represents the sum of many terms (i.e., all the chemical compounds), is a first-order perturbation. If you add it, your answer will be more precise. There is an infinite number of smaller and smaller terms that you could add, which would make your answer more and more precise. If you added this infinity of terms, the solution would be valid regardless of whether all the ε_i are small or not. It turns out that the zeroth-order approximation is quite good (to 1% precision) if you have a lot of chemical compounds (at least 100) even if some extract ~15% faster or slower than the average:
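A quick numerical check of this kind of claim is easy to sketch in Python. The setup below is an assumption, not the calculation behind the post: 100 compounds of equal mass, with inverse extraction times written as 1/τ_i = (1 + ε_i)/τ and the ε_i spread evenly between -15% and +15%.

```python
import math


def exact_total(t, masses, taus):
    """Sum of the single-compound solutions M_i (1 - exp(-t / tau_i))."""
    return sum(m * (1.0 - math.exp(-t / tau)) for m, tau in zip(masses, taus))


def zeroth_order(t, masses, tau_avg):
    """Same expression with every tau_i replaced by the average tau."""
    return sum(masses) * (1.0 - math.exp(-t / tau_avg))


n_compounds = 100
tau_avg = 1.0
# Assumed deviations: evenly spread between -0.15 and +0.15
eps = [-0.15 + 0.30 * i / (n_compounds - 1) for i in range(n_compounds)]
taus = [tau_avg / (1.0 + e) for e in eps]       # 1 / tau_i = (1 + eps_i) / tau
masses = [1.0 / n_compounds] * n_compounds      # equal masses, total mass = 1

times = [0.1 * j for j in range(1, 101)]        # from 0.1 tau to 10 tau
max_error = max(abs(exact_total(t, masses, taus) - zeroth_order(t, masses, tau_avg))
                for t in times)
print(f"largest error, as a fraction of the total soluble mass: {max_error:.4f}")
```

With deviations that average to zero, the first-order term cancels and only the much smaller second-order term is left, so the error stays well below 1% of the total mass in this particular setup.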
Now that we have described more formally what happens to a single layer of cells, we can turn our attention to the more general case where there is more than one layer. Without any detailed experiment, we need to make an assumption about the rate at which water is able to access the deeper layers. Intuitively, I see water diffusing in the coffee particle like an ensemble of small creatures that walk around randomly, and have a very small chance of getting through a door which leads to a deeper level of cells, or one that leads to a shallower level. In order for water to grab some compounds from the deeper layers and bring them back out, it will need to be able to pass back and forth through several doors.
What this kind of scenario tends to produce is also an exponentially decreasing access to the deep layers (by this point, do you think I love exponentials ?). To be sure about that, I decided to actually run such a simulation, where I took a million “droplets” of water that have a 0.1% probability of crossing over to a deeper or shallower layer at every step of time. I ran the simulation for ten thousand time steps, and every time a droplet came out of the coffee particle, I asked it how deep it reached. I then made a figure with the distribution of depths that each droplet reached:
In other words, the deeper layers will extract exponentially slower. If we decide to call x the coordinate that points inward to the deeper layers, then the characteristic time of extraction τ_i will be the combination of an intrinsic characteristic time τ’_i dependent on chemistry, and the depth x:
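A scaled-down version of such a random walk fits in a few lines of pure Python. This is a sketch under assumptions, not the original simulation: droplets start in the surface layer, a crossing goes deeper or shallower with equal probability, and the droplet count, step count and crossing probability are reduced so that it runs in well under a second.

```python
import random
from collections import Counter


def simulate_droplets(n_droplets, n_steps, p_cross, seed=0):
    """Count exited droplets by the deepest layer they reached (0 = surface layer)."""
    rng = random.Random(seed)
    deepest_counts = Counter()
    for _droplet in range(n_droplets):
        layer = 0    # every droplet starts in the surface layer
        deepest = 0
        for _step in range(n_steps):
            if rng.random() >= p_cross:
                continue  # no door crossed during this time step
            layer += 1 if rng.random() < 0.5 else -1
            if layer < 0:  # the droplet walked back out of the particle
                deepest_counts[deepest] += 1
                break
            deepest = max(deepest, layer)
    return deepest_counts


# Reduced, illustrative parameters (the post quotes 1e6 droplets, 1e4 steps, p = 0.001)
counts = simulate_droplets(n_droplets=2000, n_steps=200, p_cross=0.05)
for depth in sorted(counts):
    print(f"deepest layer {depth}: {counts[depth]} droplets")
```

Droplets that are still inside the particle when the run ends are not counted. The printed histogram falls off quickly with depth; its exact shape depends on the assumptions listed above.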
The new parameter λ represents a characteristic depth before which most of the extraction happens. If it’s small, then the extraction will only happen in a very thin shell of the coffee particle, and if it’s very large, some extraction may happen deep into its core. In reality, thanks to Barista Hustle we know that this parameter λ is probably of the order of 100 microns.
This new level of complexity means that we need to sum the extraction equations over all depths, each having their own extraction speed. The result is:
In that equation, R is the radius of a spherical coffee particle and s is the typical length of a coffee cell (about 20 microns, we assumed the cells are cubes). The index k is representative of the layer, where k = 0 is the surface. There’s a term in (R – k s) squared that appeared, which is due to the geometry of a spherical particle; each level deeper has a smaller amount of cells in it. For non-spherical particles, this term would be a bit less steep. The spherical case scenario is the most dramatic one, in the sense that the deep layers are slowest to extract (this is because spheres have maximal curvature).
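The layer-by-layer sum can be sketched in Python as follows. The ingredients come from the text: a weight in (R - k s) squared for layer k, and a characteristic time that grows exponentially with depth over a scale λ. Writing that growth as τ' exp(k s / λ) and normalizing the result to a fraction between 0 and 1 are assumptions of this sketch, as are the function and parameter names.

```python
import math


def layered_extraction(t, radius_um, tau_surface=1.0, cell_um=20.0, lam_um=100.0):
    """Fraction of one compound extracted at time t from a layered spherical particle."""
    n_layers = max(int(radius_um // cell_um), 1)
    weighted_sum = 0.0
    weight_sum = 0.0
    for k in range(n_layers):
        depth = k * cell_um
        weight = (radius_um - depth) ** 2                # cells per layer ~ (R - k s)^2
        tau_k = tau_surface * math.exp(depth / lam_um)   # deeper layers extract slower
        weighted_sum += weight * (1.0 - math.exp(-t / tau_k))
        weight_sum += weight
    return weighted_sum / weight_sum


for radius_um in (100.0, 500.0):  # hypothetical fine and coarse radii
    values = ", ".join(f"{layered_extraction(t, radius_um):.2f}" for t in (1, 3, 10))
    print(f"R = {radius_um:.0f} um, fraction extracted at t = 1, 3, 10 tau: {values}")
```

A particle only one cell deep reduces to the single-layer exponential, which is a convenient sanity check.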
There is a way to make the equation above a bit easier to deal with, by assuming that the cell layers are continuous instead of discrete. This is not true in real life because there are no “half-cells”, or fractions of cells, extracting in their own particular ways. However, I believe a continuous model would be more realistic because it would be more similar to the results of irregular layers of cells, where not all cells in a given layer are exactly at the same depth, or have exactly the same number of entry ports. These small random deviations in the exact extraction speed within a given layer of cells will produce a similar effect to the assumption of continuous layers of cells. Thus, here is the continuous version of the equation above:
In this equation, x represents the depth inside the coffee particle (expressed in the same units as the radius R). One may be tempted to solve this integral and find a form for the extracted mass versus time that is easier to work with, but don’t go there – you would encounter a dreadful beast that has many names, one of which is the confluent hypergeometric function. It takes three sets of arguments, and is a real nightmare to deal with (to all readers that are thinking right now “what the hell is this guy rambling about“, I apologize).
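Rather than chasing a closed form, the integral over depth can simply be evaluated numerically. The sketch below uses the midpoint rule and assumes the integrand is a weight (R - x) squared times the single-layer solution with a characteristic time τ exp(x / λ); that form follows the description in the text and is not a transcription of the original equation.

```python
import math


def continuous_extraction(t, radius_um, tau=1.0, lam_um=100.0, n_steps=2000):
    """Extracted fraction at time t, integrating over depth x with the midpoint rule."""
    dx = radius_um / n_steps
    weighted_sum = 0.0
    weight_sum = 0.0
    for j in range(n_steps):
        x = (j + 0.5) * dx                      # midpoint of the j-th depth slice
        weight = (radius_um - x) ** 2           # spherical geometry term
        tau_x = tau * math.exp(x / lam_um)      # slower extraction at depth x
        weighted_sum += weight * (1.0 - math.exp(-t / tau_x))
        weight_sum += weight
    return weighted_sum / weight_sum            # the slice width dx cancels in the ratio


for radius_um in (100.0, 1000.0):  # hypothetical radii
    print(f"R = {radius_um:.0f} um: {continuous_extraction(3.0, radius_um):.3f} "
          f"of the compound extracted at t = 3 tau")
```

When λ is made much larger than R, every depth extracts at the surface speed and the result falls back to the plain exponential.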
Now that we have re-framed m_i in this way, it represents the full extraction output of chemical compound number i for a full coffee particle, instead of just a coffee cell. Let’s see how its rate of extraction would be affected by the fact that deep layers extract slower:
Basically, the overall extraction for the spherical coarse particle is slower, and reaches an inflection point where this becomes really slow near 3 times the characteristic extraction time τ. If you’re willing to, go re-watch the Barista Hustle experiment, you might notice this red curve looks a lot like the cupping bowl that contains coarse coffee grounds !
Now, another interesting aspect is to estimate the contributing fraction of a fast-extracting compound to the beverage, as a function of time. Obviously, the concentration of this compound will be at its highest when the brew just started, because it had a head-start from being a fast-extracting compound. Here’s what this would look like:
… and if we did the same thing for a slow-extracting compound:
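The head start of a fast compound can be illustrated with a deliberately stripped-down sketch: two compounds of equal total mass in a single layer of cells, one fast and one slow. The characteristic times are made-up values, and the particle-size dependence discussed in the text is left out.

```python
import math


def extracted(t, total_mass, tau):
    """Single-layer solution M (1 - exp(-t / tau)) for one compound."""
    return total_mass * (1.0 - math.exp(-t / tau))


def fast_share(t, tau_fast=0.5, tau_slow=2.0, mass_fast=1.0, mass_slow=1.0):
    """Fraction of the dissolved mass that comes from the fast compound at time t."""
    m_fast = extracted(t, mass_fast, tau_fast)
    m_slow = extracted(t, mass_slow, tau_slow)
    return m_fast / (m_fast + m_slow)


for t in (0.1, 0.5, 1.0, 2.0, 5.0, 10.0):
    share = fast_share(t)
    print(f"t = {t:4.1f}: fast compound {100 * share:4.1f}%, "
          f"slow compound {100 * (1 - share):4.1f}% of the dissolved mass")
```

The fast compound dominates early and its share drifts down towards its share of the total soluble mass as the slow compound catches up.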
The figures above show how even the contribution of a given chemical compound to the cup’s flavor evolves differently for particles of different sizes. One way to go even deeper is to look at the distribution of extraction yields per coffee cell, in terms of their contribution to the total beverage by weight. Let’s look at the result for four different particle sizes, and three different brew times:
There are a few things we can learn from these figures:
- Different-sized coffee particles provide different flavor profiles to the cup.
- The highest extraction yields are always the top contribution, regardless of particle size (they correspond to the collective outermost layer of all particles).
- A coffee cup made with a perfectly even distribution of particle sizes will not taste like a set of perfectly evenly extracted coffee cells.
- Differences in flavor profiles for different particle sizes should be more stark for shorter brew times.
There’s another interesting thing we can look at with this model: How does the average extraction yield depend on particle size ? To do this, I ran a simulation at ten different brew times, from one to ten times the characteristic extraction time τ:
There’s nothing shocking here: smaller particles extract faster, and longer brew times lead to higher average extraction yields. The more interesting part is that these curves are not easy to reproduce with a simple equation. Even if we were to look at the speed at which each particle extracts as a function of its radius, or surface, the result is not a nice power law, like the naive “extraction speed” = “1 / particle surface” assumption I once made in the past. Rather, it’s a relatively complex functional form, that needs to be modelled properly instead of approximated !
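The qualitative trend can be reproduced with a small table of yields. This sketch reuses the layered spherical model under the same assumptions as before (weights in (R - k s) squared, characteristic time growing as exp(depth / λ), 20 micron cells, λ of 100 microns) and caps the yield at the ~30% soluble fraction quoted earlier. The radii and brew times are illustrative and the numbers are not calibrated against any measurement.

```python
import math

MAX_YIELD = 30.0  # % of the coffee mass that is soluble, as quoted earlier in the post


def average_yield(t, radius_um, tau=1.0, cell_um=20.0, lam_um=100.0):
    """Average extraction yield (%) of one spherical particle after brew time t."""
    n_layers = max(int(radius_um // cell_um), 1)
    extracted = 0.0
    total_weight = 0.0
    for k in range(n_layers):
        depth = k * cell_um
        weight = (radius_um - depth) ** 2       # relative number of cells in layer k
        tau_k = tau * math.exp(depth / lam_um)  # slower extraction in deeper layers
        extracted += weight * (1.0 - math.exp(-t / tau_k))
        total_weight += weight
    return MAX_YIELD * extracted / total_weight


radii_um = [50, 100, 200, 400, 800]  # hypothetical particle radii
print("radius (um):  " + ", ".join(f"{r:5d}" for r in radii_um))
for brew_time in (1, 5, 10):         # in units of the characteristic time tau
    row = ", ".join(f"{average_yield(brew_time, r):5.1f}" for r in radii_um)
    print(f"t = {brew_time:2d} tau:   {row}")
```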
The models I developed in this post will be used to translate particle size distributions into distributions of flavor profiles (via the distribution of extraction yields), in a future application I will release soon. To the hardcore geeks that made it this far in the blog post, I congratulate you two !
I’d like to thank Mitch Hale for a discussion that helped me put some order to these thoughts, and Noé Aubin Cadot for helping me figure out some Mathematica stuff.