Well, I find the Wikipedia page:

https://en.wikipedia.org/wiki/PID_contro...ral_windup fairly understandable, but you might not.

A quick explanation is that you have two variables: a control variable and a measured variable. The control variable is the one that you can control, and the measured variable is the one you want to control. For example, pitch can be considered a control variable, used to change the measured variable, altitude: you increase pitch to increase altitude and decrease pitch to decrease altitude.

The core of a PID controller is this equation:

Output = Kp * e(t) + Ki * e(T) + Kd * e'(t)

It's a bit mathy, but breaks down fairly easily: Kp, Ki, and Kd are the tuning values and therefore constant.

e(t) is the "error at time t", i.e. now. It's the difference between the measured value and the target (or setpoint) value.

e(T) is the "integral error", which is just a fancy way of saying it's the accumulated error over time. The most basic implementation for this is to just add the error to an accumulator each time.

e'(t) is the derivative of the error, in this case just the rate of change. Normally you can just use the current error minus the previous error.
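
To make the three pieces concrete, here's a minimal Python sketch of the basic per-tick form described above. The class name PID, the update method and the gain values are my own choices for illustration. I'm also assuming the error is target minus measured (so a positive error means you're below the target), and that the rate of change is treated as zero on the first tick because there's no previous error yet.

```python
class PID:
    """Basic per-tick PID controller (illustrative sketch, not a tuned design)."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.accumulated = 0.0  # e(T): running sum of past errors
        self.previous = None    # error seen on the previous tick

    def update(self, target, measured):
        error = target - measured   # e(t), assumed sign convention
        self.accumulated += error   # e(T)
        # e'(t): change since the last tick; assumed zero on the first tick
        rate = 0.0 if self.previous is None else error - self.previous
        self.previous = error
        return self.kp * error + self.ki * self.accumulated + self.kd * rate


pid = PID(kp=2.0, ki=0.5, kd=1.0)                # made-up gains
first = pid.update(target=10.0, measured=8.0)
second = pid.update(target=10.0, measured=9.0)
print(first, second)  # 5.0 2.5
```

With those made-up gains, the first call gives 2*2 + 0.5*2 + 0 = 5.0 and the second gives 2*1 + 0.5*3 + 1*(-1) = 2.5.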

The way it works is that "Kp * e(t)" is the "proportional" term: you alter your control variable based on how far you are from your target. On its own, this can result in oscillation around the target value and/or a failure to settle at the target value at all.
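
Here's a rough Python sketch of the "failure to settle at the target" case. Everything about the plant is an assumption made up for illustration: each tick the value gains 0.1 times the control output and loses a constant 1.0 (think of a steady sink rate). The function name simulate_p_only and the numbers are hypothetical too.

```python
def simulate_p_only(kp, target=100.0, ticks=200):
    """Run a proportional-only controller on a toy plant; return the final value."""
    value = 0.0
    for _ in range(ticks):
        error = target - value      # positive while below the target
        control = kp * error        # proportional term only
        # Toy plant (assumed): gain of 0.1 on the control, constant loss of 1.0
        value += 0.1 * control - 1.0
    return value


settled = simulate_p_only(kp=2.0)
print(round(settled, 3))  # 95.0
```

The value comes to rest at 95 rather than 100, because that's where the proportional push exactly cancels the constant loss. In this toy model the leftover error works out to 10 / kp, so raising the gain shrinks it but never removes it, and once kp goes past 10 the value starts hopping back and forth across its resting point instead.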

"Ki * e(T)" is the "integral" term, which improves the rate at which the target value is approached and prevents the value from being persistently under or over the target. It's clear how this works if you consider the fact that e(T) accumulates past error. Lets say you have a system that persistently stays under the target value, with an error of 1, after the first tick, the accumulated error is 1, the next tick it's 2, then 3, 4, 5, etc. As the accumulated error increases so does the control value. However, the integral term can increase overshoot and settling time.

"Kd * e'(t)" is the "derivative" term, which accounts for the rate at which the error is changing in order to compensate. This helps avoid overshoot and decreases settling time, at the expense of increasing the time it takes to reach the target value (though not normally by much). This is essentially like braking to a stop in a car. You don't wait until you're at the place you want to stop then slam the brakes on, you start braking ahead of time.

Hopefully that should get you started. They're actually quite easy to implement, at least in a basic form, and playing with the values and seeing the effects can be pretty enlightening.