A note on the strength of note onset strength

and the mystical transformative healing power of number theory

When we invent algorithms that write music, our research often focuses on pitch, rhythm, timbre, melody, and harmony. Note-strength[1] is typically not the direct focus of research, and is thus often neglected. However, any rendition of music that does not vary note-strength will be tiresome. Here we present a quirk of number theory that can be used as a quick and dirty way to apply a default strength to each note as a post-processing step, for music where the beat locations are known.

Let us assume that notes falling on the beat should be the strongest, those falling on the half-beat should be the next-strongest, those on the second and fourth quarter-beats somewhat weaker than that, and so on. Assuming the beat occupies times 0 ≤ t < 1, this implies that the strength is inversely proportional to the denominator of the rational representation[2] of the time where the note onset occurs. This just so happens to be related to the Ford circle packing, depicted in Figure 1, where the circle sizes represent the strengths of notes that occur where the circles touch the x axis.
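For example, within one beat, a note at t = 0 (denominator 1) gets full strength 1; a note at t = 1/2 gets strength 1/2; notes at t = 1/4 or 3/4 get 1/4; triplet positions t = 1/3 and 2/3 get 1/3; and an off-kilter onset like t = 7/16 gets only 1/16.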


Figure 1: Ford circle packing, shown for denominators not greater than 7. Imagine that time is on the x axis, and that one beat is represented here. The sizes of the circles represent note-strength. A note occurring at time t = 0 will be the strongest, one at t = 1/2 will be the next-strongest, and so on. In principle, there should be infinitely many smaller circles here, one for each rational number. We note that the circles have diameters 1/(denom(x)^2), although here we don't care about the absolute size of the circles, only their relative sizes.

But what if the note onset times are not explicitly quantized, or worse, if a note occurs at an irrational time, which cannot be expressed as a fraction with integer coefficients[3]? Should a note that occurs at time t = 0.4999999 be imperceptibly weak because its denominator is 10000000? Probably not; it should probably count as 1/2. What we want to know is, in a sense, how rational an onset time is. That is, we want the nearest rational approximation to the onset time with denominator not greater than n, giving preference to small denominators.

The Ford circle packing has a nice property that allows us to compute this efficiently for any n. Take any two circles A and B that are touching. By the geometry, there will be exactly one smaller circle C that touches both A and B as well as the x axis. Moreover, C will be the largest circle between A and B. The place where C touches the x axis is the Farey sum of the places where A and B touch the x axis. The Farey sum is defined to be

A ⊕ B = C ≝ (num(A) + num(B)) / (denom(A) + denom(B))


where ⊕ indicates the Farey sum, and num(x) and denom(x) extract the numerator and denominator, respectively, of the rational representation of x. This, taken together with the fact that the Ford circle packing is an infinitely deep fractal with circles on all rational numbers, gives us a means of traversing Figure 1 by successive approximation, to arbitrary precision.
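For example, 0/1 ⊕ 1/1 = (0 + 1)/(1 + 1) = 1/2, and 1/2 ⊕ 1/1 = (1 + 1)/(2 + 1) = 2/3; in Figure 1, the largest circle between 0 and 1 touches the axis at 1/2, and the largest circle between 1/2 and 1 touches at 2/3.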

Suppose we want to find a rational approximation of t ∈ [0, 1), with denominator not greater than n, and return 1/denominator. In pseudocode:


Start with A ← 0/1, B ← 1/1

Loop while denom(A) + denom(B) ≤ n:
    C ← A ⊕ B
    if t < C:
        B ← C
    else (t ≥ C):
        A ← C
    if A = t:
        break

result ← A or B, whichever is closer to t
return 1/denom(result)



The first four iterations of this algorithm are depicted in Figure 2.


Figure 2: The first four iterations of the algorithm that finds the onset strength of a note whose onset time is t = 0.49. It starts with A = 0/1 and B = 1/1 and computes their Farey sum, C. It moves either A or B to C, keeping t bounded between A and B. It repeats this until either C lands on t or the specified search depth is reached. If the algorithm terminated after the last step depicted here, it would return 1/2 (A or B, whichever is closer to t); with more iterations it would find ever closer approximations, eventually reaching 1/100, as 0.49 is exactly 49/100.
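Concretely, for t = 0.49 and n = 8, the four iterations of Figure 2 run:

A = 0/1, B = 1/1 → C = 1/2; t < 1/2, so B ← 1/2
A = 0/1, B = 1/2 → C = 1/3; t ≥ 1/3, so A ← 1/3
A = 1/3, B = 1/2 → C = 2/5; t ≥ 2/5, so A ← 2/5
A = 2/5, B = 1/2 → C = 3/7; t ≥ 3/7, so A ← 3/7

The next mediant, 4/9, has denominator greater than 8, so the search stops; B = 1/2 is closer to t than A = 3/7, and the returned strength is 1/denom(1/2) = 1/2.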

For clarity, a full C-language implementation is provided in Example 1, complete with proper boundary checking.


#include <math.h>   /* for floorf() and fabsf() */

float rhythm_default_strength_for_onset(float onset_time, int n)
{
  int   num_a=0, denom_a=1, num_b=1, denom_b=1, num_c, denom_c;
  float a=0, b, c;   /* a starts at 0/1 in case the loop body never runs (n < 2) */

  /* wrap the onset time into [0, 1) so only its position within the beat matters */
  onset_time -= floorf(onset_time);

  /* denom(A ⊕ B) = denom(A) + denom(B), so stop before the mediant's denominator exceeds n */
  while((denom_a + denom_b) <= n)
    {
      /* C is the Farey sum (mediant) of A and B */
      num_c   = num_a   + num_b;
      denom_c = denom_a + denom_b;
      c = num_c / (float)denom_c;

      /* move A or B to C, keeping t bracketed between them */
      if(onset_time < c)
        {num_b=num_c; denom_b=denom_c;}
      else
        {num_a=num_c; denom_a=denom_c;}

      a = num_a / (float)denom_a;
      if(a == onset_time)
        break;
    }

  /* use whichever endpoint, A or B, is closer to t */
  b = num_b / (float)denom_b;
  if(fabsf(b - onset_time) < fabsf(a - onset_time))
     denom_a = denom_b;

  return 1.0f / denom_a;
}
Example 1: Complete algorithm to compute note-strength.
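As a sanity check, here is a minimal usage sketch (my own test harness, not part of the original listing); the expected values follow from the discussion above:

#include <stdio.h>

/* assumes rhythm_default_strength_for_onset() from Example 1 is in scope */
int main(void)
{
  printf("%f\n", rhythm_default_strength_for_onset(0.00f, 8));  /* 1.000000: on the beat     */
  printf("%f\n", rhythm_default_strength_for_onset(0.50f, 8));  /* 0.500000: half-beat       */
  printf("%f\n", rhythm_default_strength_for_onset(0.49f, 8));  /* 0.500000: counts as 1/2   */
  printf("%f\n", rhythm_default_strength_for_onset(0.25f, 8));  /* 0.250000: quarter-beat    */
  return 0;
}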


Figure 3 depicts the onset strength for all possible note locations in a single beat (all values of t ∈ [0, 1)), with n = 8. This shows that the onsets that occur on or near the 'most rational' parts of the beat are the strongest.


Figure 3: Onset strengths for all onset times within a single beat, with n = 8.
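If you want to regenerate the data behind a plot like Figure 3, a short loop over the beat suffices (a sketch, again assuming Example 1 is in scope):

#include <stdio.h>

/* print (t, strength) pairs across one beat, suitable for plotting */
int main(void)
{
  for(float t = 0.0f; t < 1.0f; t += 0.001f)
    printf("%f %f\n", t, rhythm_default_strength_for_onset(t, 8));
  return 0;
}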

I previously stated that the onset times do not need to be explicitly quantized, but clearly this algorithm is related to quantization. In fact, it is equivalent to quantizing the onset times to a Farey sequence of order n, i.e. the ordered list of all rational numbers in [0, 1] with denominators not greater than n. Note that this is not equivalent to a grid-based quantizer. A grid with spacing 1/8 would not quantize triplets, whereas a Farey sequence of order 8 would, as well as quintuplets and septuplets. Additionally, the Farey sequence is no more likely than the grid to mis-quantize notes near the beat, because Farey has higher temporal resolution only near finer subdivisions of the beat. So, I would predict that Farey would perform better than a grid for quick-and-dirty quantization, although for some reason grids are everywhere and I have never noticed Farey used this way.
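To make that concrete, here is a minimal sketch of Example 1 repurposed as a Farey quantizer; it is the same search, returning the snapped time num/denom instead of the strength 1/denom (the name rhythm_quantize_onset is mine):

#include <math.h>

/* snap onset_time to the nearest fraction with denominator <= n,
   i.e. to the Farey sequence of order n, preserving the whole-beat part */
float rhythm_quantize_onset(float onset_time, int n)
{
  int   num_a=0, denom_a=1, num_b=1, denom_b=1, num_c, denom_c;
  float a=0, b, beat = floorf(onset_time);

  onset_time -= beat;   /* position within the beat */

  while((denom_a + denom_b) <= n)
    {
      num_c   = num_a   + num_b;   /* C = A ⊕ B */
      denom_c = denom_a + denom_b;

      if(onset_time < num_c / (float)denom_c)
        {num_b=num_c; denom_b=denom_c;}
      else
        {num_a=num_c; denom_a=denom_c;}

      a = num_a / (float)denom_a;
      if(a == onset_time)
        break;
    }

  b = num_b / (float)denom_b;
  if(fabsf(b - onset_time) < fabsf(a - onset_time))
    a = b;

  return beat + a;   /* the quantized onset time */
}

For example, with n = 8 an onset at t = 0.26 snaps to 1/4, and one at t = 0.34 snaps to 1/3, which a 1/8 grid could not represent.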

In conclusion, we have presented an algorithm that assigns default strength values to notes within a beat. The same algorithm can also serve as a simple quantizer, as sketched above (returning num/denom instead of 1/denom). That is it.

Notes

1) I use the term ‘note-strength’ here to refer generically to a collection of related terms like loudness, velocity, and other features that could be driven using the method presented in this paper.

2) By ‘rational representation’, I mean that the number is expressed as a fraction with coprime integer coefficients.

3) If you throw a dart at a number line, you will, with 100% probability, hit an irrational number.
