Skip to content

Tutorial/Smoothing algorithm is a poor choice #3934

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
af3556 opened this issue Oct 8, 2015 · 63 comments
Closed

Tutorial/Smoothing algorithm is a poor choice #3934

af3556 opened this issue Oct 8, 2015 · 63 comments
Assignees
Labels
Component: Documentation Related to Arduino's documentation content
Milestone

Comments

@af3556
Copy link

af3556 commented Oct 8, 2015

The algorithm used in the Smoothing tutorial (https://www.arduino.cc/en/Tutorial/Smoothing) uses a simple moving average of 10 samples, storing the samples in an array.

Whilst useful in some applications, for simply removing noise from an analog input a simple moving average (IIR filter) is a more efficient and less convoluted method. e.g.:

const int filterWeight = 4;  // higher numbers = heavier filtering
const int numReadings = 10;

int average = 0;                // the average

void setup() {
  // initialize serial communication with computer:
  Serial.begin(9600);
  // seed the filter
  average = analogRead(inputPin);
}
void loop() {
  for (int i = 0; i < numReadings; i++) {
    average = average + (analogRead(inputPin) - average) >> filterWeight;
  }
  Serial.println(average);
  delay(1);
}
@ffissore ffissore added the Component: Documentation Related to Arduino's documentation content label Oct 8, 2015
@ffissore ffissore added this to the Release 1.6.6 milestone Oct 8, 2015
@q2dg
Copy link

q2dg commented Oct 8, 2015

I think introducing the shifting operator brings a extra difficulty to the example...I'm not sure if it is convenient in this case. Maybe adding this code as a "advanced method" in the same page could do the trick.

@shiftleftplusone
Copy link

IMO 2 filters make sense:

  1. a low pass filter
    newavrg = eta * (newval) + (1-eta) * oldavrg // (e.g., eta=0.9)
  2. a median filter (of 3, or of 5) by a ring memory, (although costly and lavish, but always "real values" and shoves out all extremely bad noise peaks)

@af3556
Copy link
Author

af3556 commented Oct 8, 2015

The shift isn't really important - that's a comparatively minor optimisation. Perhaps a note along the lines of "using a power of 2 divisor will allow the compiler to optimise the calculation".

i.e. this works just as well:

// using a power of 2 for the filterWeight will allow the compiler to optimise the calculation
const int filterWeight = 64;  // higher numbers = heavier filtering
const int numReadings = 10;

int average = 0;                // the average

void setup() {
  // initialize serial communication with computer:
  Serial.begin(9600);
  // seed the filter
  average = analogRead(inputPin);
}
void loop() {
  for (int i = 0; i < numReadings; i++) {
    average = average + (analogRead(inputPin) - average) / filterWeight;
  }
  Serial.println(average);
  delay(1);
}

@af3556
Copy link
Author

af3556 commented Oct 8, 2015

Ah, yes, using the phrase "low pass filter" would also be good, to at least introduce the topic and give a good term for further research.

@shiftleftplusone
Copy link

actually
average = average + (analogRead(inputPin) - average) / filterWeight;
is _no_ low pass filter.
a low pass filter would be

average = ETA * (analogRead(inputPin)) + (1-ETA) * average; // ETA: filter weight, e.g. ETA = 0.9

@PaulStoffregen
Copy link
Contributor

Those 2 equations are identical, if 1/filterWeight is ETA. Simple algebra to factor and rearrange the terms.

@af3556
Copy link
Author

af3556 commented Oct 10, 2015

Indeed, same function. The 'filterWeight' version lends itself directly to a pure-integer shift-rather-than-division implementation; the traditional expression typically has 'ETA' (alpha) as a fraction (real).

@PaulStoffregen
Copy link
Contributor

Agreed, from a practical numerical viewpoint, the original is far better.

But for signed division, the complier usually doesn't replace powers of 2 by shifting. That tends to happen only for unsigned numbers.

@shiftleftplusone
Copy link

no, it's not identical IMO, because average is multiplied by (1-ETA) and aditionally (!) newvalue is multiplied by ETA.
I don't see a multiplicant by your newvalue.
But I stand corrected in case I'm wrong.

anyway,
average = ETA * (newvalue) + (1-ETA) * average;
is the common formula for lowpass filters.

@PaulStoffregen
Copy link
Contributor

sigh Basic algebra....

average = ETA * (newvalue) + (1-ETA) * average
ETA = 1 / filterWeight
average = (1 / filterWeight) * newvalue + (1 - 1 / filterWeight) * average
average = newvalue / filterWeight + average * (1 - 1 / filterWeight)
average = newvalue / filterWeight + average - average / filterWeight
average = average + newvalue / filterWeight - average / filterWeight
average = average + (newvalue - average) / filterWeight

@shiftleftplusone
Copy link

you win! :)
but nevertheless, my formula is better :
average = ETA * (newvalue) + (1-ETA) * average
because it's easier, and no division by zero possible, and it's the common formula notation,and because there will be no integer division fraction/cut-off/rounding error or how ever you may call this in English 8-)

@NicoHood
Copy link
Contributor

I also like his formular, although I dont know what ETA means, but i know what it is used for.

I also use this here:
https://github.com/NicoHood/MSGEQ7/blob/dev/src/MSGEQ7.hpp#L135
I just wasnt sure if there was a better implementation for it.

@shiftleftplusone
Copy link

float ETA
is just a weight factor (don't know how to say it better in English).
0<= ETA <=1

by ETA=0.5 its just the arithmetic average (mean), by 0.9 the current readings are more strongly weighted, and by ETA < 0.5 it's just an extremely idle/dull filter, showing almost no peaks any more.

@NicoHood
Copy link
Contributor

Sure. I just wonder what E, T and A stand for. What is does i know for sure ;)

@shiftleftplusone
Copy link

eta is a Greek letter
(alpha, beta, gamma, delta, epsilon, zeta, eta, theta,....)
and is often used im mathematics for constants in analysis.

As it's a constant, I'm writing it in Capital letters as usual for constants in C programs.

@NicoHood
Copy link
Contributor

I never got above epsilon :D
But yeah this function should do it. But please use uint16_t values here, no floating points. (see link above, i do exactly the same)

@shiftleftplusone
Copy link

In New Common Greek it's pronounced "iita", not "eeeta" as in Ancient Classic Greek.

But you may use
float EPSILON;
instead as well - but admittedly you'd have much more chars to write... ;)

Nevertheless, as ETA is from 0...1 only floats will work. In the _real_ world nothing is integer, every number is _real, i.e. (rational, irrational, or even transcendental) a _float. ;)

@Arjan-Woltjer
Copy link

@Vogonjeltzz, I don't know what is better. As Paul already mentioned, his solution can be implemented by bitshift, which is far more economical than your solution. So on paper you may win, in practice Paul is using processing power far better.

@shiftleftplusone
Copy link

ok, for speed on an AVR you are right, but for a tutorial I would prefer the float version I posted nevertheless.
I'm personally using Dues above all, and so the speed by float calculations is no issue generally .

@PaulStoffregen
Copy link
Contributor

FWIW, I don't deserve any credit for the original code, even though I've independently used that same equation in several projects over the years.

When evaluating these algorithms, with either integers or floats, the effects of limited precision should be investigated. Especially important is the case of a small AC or higher frequency signal with a large DC or very low frequency offset. The practical effect of limited numerical precision in the addition or subtraction is a point where special attention should be paid.

I would caution anyone reading this to avoid investing too much confidence in arguments that amount to personal preference, without actual testing or rigorous mathematical analysis.

@shiftleftplusone
Copy link

for educational reasons, the approach by using floats is more intuitive, and additionally, always precise and accurate (related to the chosen value of ETA and the precision of 32bit floats in general):

float average, ETA=0.9;
average = ETA * (float)newvalue + (1-ETA) * average

For filters (of either kind), in general, by a scientific+ statistical point of view, the sensitivity, accuracy, and the evidence has to be proven of course by a statistical evaluation.
E.g. a low-pass filter may severly suffer from bad (e.g., "rectified") noise by independent high non-Gaussian peaks which will badly affect the long-time average. In this case a median filter would be more suitable. But this aspect will overstrain the intentions of a smoothing sensor tutorial for beginners, IMO.

@PaulStoffregen
Copy link
Contributor

using floats is more intuitive, and additionally, always precise and accurate

That's quite an overly broad and confident assumption! Perhaps the word "always" is meant to include ARM chips, where "int" is 32 bits, and "float" uses a 23 bit mantissa?

Again, I would caution anyone reading this thread from placing too much faith in some of these arguments, which amount to little more than casual conversation about personal preferences.

@shiftleftplusone
Copy link

no, "always precise and accurate" is related to the range of expectable results and digits over the whole range of floats even for integer parts ("Vorkommastellen") <1.0 or even <10.0, still providing 6-7 fraction digits, and it's related to the fact that no division by zero is possible which would result in nans ( related to average = average + (analogRead(inputPin) - average) / filterWeight;)

Even more precise would be using double precision floats though.
And remember that an Arduino tutorial is not just an AVR tutorial but also has to target 32-bit ARM or Atheros cpu users as well, which is becoming more an more significant and frequent.

Anyway, (analogRead(inputPin) - average) >> filterWeight is a term which is absolutely misplaced in a beginner's tutorial.

@NicoHood
Copy link
Contributor

I never ever used a single float in any of my c/c++ programs and I probably never will unless I code for a PC and then I will avoid it anyways. If you ask me AVR should not support float at all because its just not the right platform to use it. But a simple smooth value from 0-255 is totally fine i'd say, which can be achieved with the formular too.

@shiftleftplusone
Copy link

yes, sure it's possible, and there are many more filters possible, too, e.g. just a mean average or a median filter or even a Kalman filter or a MonteCarlo filter.

But this issue/ topic is about a tutorial of a low pass filter (CMIIW), and therefore the point is about an educational approach to make it capable for beginners.

Floats or not floats is a complete different issue - to me I'm always using floats because my programs require floats. Just everyone to his special needs.

@NicoHood
Copy link
Contributor

Just everyone to his special needs.

Yep have fun XD

I'll leave the choice to @agdl or maybe @tigoe can help here.

@shiftleftplusone
Copy link

you should explain:
filterweight is a constant factor for smoothing.
use it around 0.9 for a little smoothing,
0.7 for more smoothing
and even lower for very heavy smoothing

(english is not my native language, you're supposed to know what I mean nevertheless)

@tigoe
Copy link
Member

tigoe commented Oct 12, 2015

Okay. You want me to play devil's advocate until we get the terms, I can do that.

What is a constant factor?

Why 0.9 and not, say, 32?

Why 0.7 instead of, say, -1?

What do you mean by weight?

@shiftleftplusone
Copy link

it's 0..1 by definition. Otherwise it won't work

@tigoe
Copy link
Member

tigoe commented Oct 12, 2015

Who defined it and why did they define it that way?

@shiftleftplusone
Copy link

the author of the filter defined it for the filter to work.
But that's a really stupid attitude.
In maths it is as it is, period.

people who don't understand even that better don't start programming, at least no filters.

using the term "confidence" instead of "filterweight" wouldn't make it better - just the opposite.
Maths must not be suggestive, everything has to be just as it is by definition.

@q2dg
Copy link

q2dg commented Oct 12, 2015

It's a number which gives some elements more "weight" or influence on the
result than other elements in the same set (from Wikipedia).
Its value goes from 0 -cancelling the presence of that specific element- to
1 -being transparent-. For instance: a value of 0.5 will give to associated
element halt the importance of the others.

2015-10-12 19:36 GMT+02:00 tigoe notifications@github.com:

Who defined it and why did they define it that way?


Reply to this email directly or view it on GitHub
#3934 (comment).

@tigoe
Copy link
Member

tigoe commented Oct 12, 2015

I disagree. I see 110new students every year, graduate students, and a good 20% of them are interested in programming, but have been turned off by math sometime in the past. Often it's because people told them they "should know". We made Arduino mainly to make it possible to explain programming to these kinds of people. And it is possible. I've seen it work.

But it requires that we let go of the notion that someone "should know". I can show you engineering students who can run rings around many people in this thread but wouldn't know a gerund from a whole in the ground. And when they're in a room with writers who can't program, and they both let go of their assumptions and trust each other, they teach other and they both learn. That is why I'm pushing on this. I want to help people feel like they understand technology, not just those who are good at it.

@q2dg, thank you. That is a much more usable and succinct answer.

@shiftleftplusone
Copy link

well, do want you think what would be best.
I was teaching lyceum/highschool students of 10-19 years age and I'm quite sure what they are capable of or not (at least in germany) ;)
I honestly don't see what in a formula like

average = filterweight * (sensorReading) + (1 - filterweight ) * average; // choose filterweight 0...1

could be misleading.

But I know for sure what would be misleading using "confidence", especially for non-English speakers. you might also call it "prayer", with the same result.

@NicoHood
Copy link
Contributor

average = filterweight * (sensorReading) + (255 - filterweight ) * average;
0-255 and the program is fast. How come that? \o/
IF you decide to work on a microcontroller, then please do it properly from the beginning.

Basic math guys, basic math... I think students in that age will understand that. If not you are a bad teacher if you are not able to explain that in a minute. And if they dont understand they are in the wrong class or should read the if - else section again.

@shiftleftplusone
Copy link

do what you want, I only can give advices.
I don't care about AVRs and their limitations, my Due is closely to fast enough, and a Teensy or a Zero or a Yun also is supposed to be. I don't care about embedded systems by rediculous performance not being able to handle floats., and handling of floats is indispensible and a minimum requirement (such as sin, cos, atan, sqrt, pow).

@tigoe
Copy link
Member

tigoe commented Oct 12, 2015

Sorry, @vogonjeltz, I didn't mean to insult your experience. I get agitated on this issue.

@q2dg laid it to rest for me above. I'm fine using "filterWeight" as long as the comments at the top of the sketch read something like that Wikipedia quote. Modified, I'd say:

"filterWeight is a measure of the importance or "weight" of one sensor reading relative to the others. A reading's weight is measured on a scale from 0 (i.e. no value at all) to 1 (i.e. more important than all the others)." (I think I got that backwards, but that's the general idea.)

I think what you said about 0.7 - 0.9 is useful, but I want to find more explicit language for it.

@shiftleftplusone
Copy link

first comes the equation, then comes the explanation. You might wish to work out the explanation, from time to time, as time goes by....

@shiftleftplusone
Copy link

The filter recurrence relation provides a way to determine the output samples in terms of the input samples and the preceding output. The following pseudocode algorithm simulates the effect of a low-pass filter on a series of digital samples:

// Return RC low-pass filter output samples, given input samples,
// time interval dt, and time constant RC
function lowpass(real[0..n] x, real dt, real RC)
var real[0..n] y
var real α := dt / (RC + dt)
y[0] := x[0]
for i from 1 to n
y[i] := α * x[i] + (1-α) * y[i-1]
return y

The loop that calculates each of the n outputs can be refactored into the equivalent:

for i from 1 to n
y[i] := y[i-1] + α * (x[i] - y[i-1])

That is, the change from one filter output to the next is proportional to the difference between the previous output and the next input. This exponential smoothing property matches the exponential decay seen in the continuous-time system. As expected, as the time constant \scriptstyle RC increases, the discrete-time smoothing parameter \scriptstyle \alpha decreases, and the output samples \scriptstyle (y_1,, y_2,, \ldots,, y_n) respond more slowly to a change in the input samples \scriptstyle (x_1,, x_2,, \ldots,, x_n); the system has more inertia. This filter is an infinite-impulse-response (IIR) single-pole low-pass filter.

https://en.wikipedia.org/wiki/Low-pass_filter#Discrete-time_realization

@tigoe
Copy link
Member

tigoe commented Oct 12, 2015

Right. I'll come back and turn that into something more digestible later on.

@shiftleftplusone
Copy link

remember, it's a formula. call the terms and multiplicants as you want but keep away from being suggestive / evocative. a factor is a factor is a factor. Explain what it does by which effecs, that's the simple point.

@tigoe
Copy link
Member

tigoe commented Oct 12, 2015

Clearly we have different ways of teaching. I've found that being suggestive works very well, when the suggestion is in the students' experience. I spend much of my time looking for the right analogies.

@shiftleftplusone
Copy link

not if the suggestive words lead to misleading, such as "confidence"
OMG, what have I do to believe and to be confident?
Maybe a prayer might help?

@shiftleftplusone
Copy link

start with the equation, then we may work on the explanation.
Start with neutral non-confidential expressions.

@tigoe
Copy link
Member

tigoe commented Oct 12, 2015

I don't think we're going to reach agreement on this. I'm therefore going to bow out of the conversation, so I don't offend anymore.

@shiftleftplusone
Copy link

call it filterfactor or smoothingfactor or just x.
And say: x is the smoothing factor.
High x close to 1: almost no smoothing., but still reacting quickly to changing values.
Low x close to zero: heavy smoothing, eliminating almost all noise peaks, but very slowly reacting to quickly changing sensor values..
Try them start by 0.9 or 0.7, and adjust it by your experience and your observations.
No filter by either formula will EVER fit to all possible empiric situations.

and keep in your mind:
Sketch is C
And C is fucking obfuscated.
And maths is fucking abstract and generalizing.

The earlier people internalize this, the better.

So don't put your explanations into your equation identifiers.

@tigoe
Copy link
Member

tigoe commented Oct 13, 2015

Okay, spent the evening testing @bdlow's int version and @vogonjeltz's float version using a 3G accelerometer. I did a graphing sketch in P5.js to test it out, you can find my sketch and tests here: https://github.com/tigoe/GraphingSketches/tree/master/WeightedAverageGraph. You can see some images of the float version in the directory as well.

The int version gives more clearly defined abstractions of the signal, i.e. you can filter the transitions and peaks easier from the average. But the actual values are further off from the unfiltered values. The float version tends to change more closely with the sensor values, even with a high value. Of the two, I think I'd prefer the int version, after testing. I think in the long run, it's probably more useful to people looking to filter events out of their sensors. I'd recommend switching to that version, and leaving the filtering weight value at about 16 to 32.

In trying to make sense of the math above, I stumbled upon a version I did in 2005, and modified again in 2012: http://www.tigoe.com/pcomp/code/arduinowiring/37/ And even better the paper on which I based that version, which includes a fairly readable explanation of what's going on: http://home.earthlink.net/~david.schultz/rnd/2002/KalmanApogee.pdf In that paper, p.8, "Filtering the Data", he explains what's going on better than anything else I could find.

Ultimately, I'll leave it up to @ffissore, @agdl , and @cmaglie. If you think the change is worthwhile, feel free to use either of these, with credit to the appropriate authors. I'm happy with them both.

@shiftleftplusone
Copy link

what'd I say?

Filtering the Data
First of all lets look at using a simple filter to average several pressure sensor readings and smooth out the data. The filter I have chosen is an recursive filter since it is very simple and doesn't require a lot of data storage or processing so it can be easily implemented on even the simplest micro-controller. This filter is roughly equivalent to a moving average filter except that it does not require the storage of prior samples. The basic idea is to estimate the current altitude by taking a weighted average of our last estimate and the current ADC reading. The formula for this is:

 P_[n+1]  = x * ADC + (1-x) * P_[n] 

where:
P_[n] is our last estimate
P_[n+1] is our new estimate
ADC is the latest reading from the ADC
x is the weight ( 0 <= x <= 1)
Note that when x is 0.5 we get:
P_[n+1] = (ADC + P_[n] )/2
or just the average of the two values.
So how does it work?
The weight x can be considered the value of our confidence in the measurement from the ADC. If we are
sure that it is absolutely accurate, the weight (x) would be 1. So we would always believe the ADC and ignore the last estimate.
At the other extreme would be a weight of 0. This would mean that we have absolutely no confidence in our measurement so we stick with our last estimate. Needless to say the estimate is then a constant value.
Reality lies somewhere in between these two extremes.
Also, as the value of x is decreased, the response of the filter slows down. Thus there has to be a balancing act. If we use a very small value of x the data will be very smooth but it will be delayed so much that we will always detect apogee late.

@tigoe
Copy link
Member

tigoe commented Oct 13, 2015

Yes. He said it more accessibly than you.

@shiftleftplusone
Copy link

yes, I'm not English, and explaining it is your job, but the main point actually was the equation.
He took the same formula as I and he called it just x - and explained how the factor x works.

I explained it above in short words

newavrg = eta * (newval) + (1-eta) * oldavrg

"by eta=0.5 its just the arithmetic average (mean), by 0.9 the current readings are more strongly weighted, and by eta < 0.5 it's just an extremely idle/dull filter, showing almost no peaks any more."

He said the same but more verbose. Anyway, the main discussion was about the right formula to be taken.

@tigoe
Copy link
Member

tigoe commented Oct 13, 2015

Agreed. For me, I need the more verbose description, which is why his helped me. Apologies for my shortcomings that way, I didn't mean to aggravate you.

@agdl
Copy link
Member

agdl commented Oct 28, 2015

I'm a little bit scared in writing something here :)

I wanted to write only too many comments and close the issue, but it is not too polite...

The showed examples somewhere in the Arduino website are intended for people who firstly approach micro-controllers programming like said by @tigoe so it's a good idea to keep them more simple as possible. If someone wants to improve them can easily search the web, books or whatever, however I think the most of them are a good starting point.

So based on this I will leave it as is. In this way everyone can say: "ok it sucks like @vogonjeltz" and no one can say "I can't understand why you chose the algorithm written by someone instead of the one written by someone else".

Please don't start again a one way discussion about I'm right and you are wrong :)

@agdl agdl closed this as completed Oct 28, 2015
@shiftleftplusone
Copy link

@agdl - what did you mean?

Do you honestly expect a beginner to understand how this works?

for (int i = 0; i < numReadings; i++) {
    average = average + (analogRead(inputPin) - average) >> filterWeight;
  }

as I am no advanced programmer, just a couple of years C-like programming, and I don't understand at all what this is about to do. And I wouldn't assume any beginner to understand what this is about.

But again, honestly, this is the way one usually uses this filter, you can read it it either tuutorial or source reference:

avrg = x * (newval) + (1-x) * avrg  // for x >= 0 and x <= 1
  • and it's absolutely clear to see what it does.
    It's not why I say it's right or why You maybe don't understand - or either beginner - it's just better and more easy and more common this latter way:
    a simple mathematical weighted average (mean).
  • the first formula instead is absolutely uncomprehensible, especially for a beginner's tutorial and beginner's skills !

@agdl
Copy link
Member

agdl commented Oct 29, 2015

Uh? The formula in now on-line is the basic average value avg = sum/n written as:

 // calculate the average:
  average = total / numReadings;

@shiftleftplusone
Copy link

not the basic average (mean) - the weighted average, in this case the weighted floating average.
for x=0.5 it's the basic average (mean) though:

avg = 0.5*SensorValue + (1 - 0.5) * avg_old 
    = 0.5*(SensorValue + avg_old) 
    = (SensorValue + avg_old) / 2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: Documentation Related to Arduino's documentation content
Projects
None yet
Development

No branches or pull requests

9 participants