Saturation of signed result

Saturation logic is very common in DSP datapaths, where the result of a filter (or any other arithmetic operation) is clipped between two saturation limits before it is written to the result registers. There are two scenarios to look at:
  1. A signed result to be clipped between two unsigned values (the typical case in video processing, where the result is a pixel)
  2. A signed result to be clipped between two signed values
Case 1: Clipping between unsigned numbers
If the result to be clipped is wider than the final result after clipping, it is necessary to check for positive overflow.
Example:
a_unsat[11:0] is the signed result before saturation.
a_sat[7:0] is the unsigned result after saturation.
upper_limit[7:0] and lower_limit[7:0] both are unsigned saturation limits.

Although the limits are 8-bit unsigned and the unsaturated result is signed, bits [7:0] of a_unsat can be compared directly with the limits, provided two cases are handled first. Because the clipped result is unsigned, any negative unsaturated value is assigned lower_limit, and any value above the 8-bit maximum (255) is flagged by the positive overflow signal and assigned upper_limit. Only when neither case applies do bits [7:0] hold the full value.

pos_ovf = ~a_unsat[11] & |a_unsat[10:8]
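As a quick check of the expression, here is a minimal simulation sketch that evaluates it on three boundary values. The module name pos_ovf_demo and the chosen test values are only illustrative.

```verilog
// Illustrative sketch: evaluate the positive-overflow flag on boundary values.
module pos_ovf_demo;
  reg signed [11:0] a_unsat;

  // Positive (sign bit 0) and at least one bit above bit 7 is set.
  wire pos_ovf = ~a_unsat[11] & (|a_unsat[10:8]);

  initial begin
    a_unsat = 12'sd255;  #1 $display("%0d pos_ovf=%b", a_unsat, pos_ovf); // 255 pos_ovf=0
    a_unsat = 12'sd256;  #1 $display("%0d pos_ovf=%b", a_unsat, pos_ovf); // 256 pos_ovf=1
    a_unsat = -12'sd1;   #1 $display("%0d pos_ovf=%b", a_unsat, pos_ovf); // -1 pos_ovf=0
  end
endmodule
```

The value 255 still fits in 8 unsigned bits, 256 is the first value that does not, and negative values never raise the flag because the sign bit masks it.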

Finally, the clipping function looks like this.

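function [7:0] unsgn_sat_res;
  input [11:0] in;         // signed 12-bit value, passed as raw bits
  input [7:0]  up_limit;   // unsigned upper saturation limit
  input [7:0]  low_limit;  // unsigned lower saturation limit

  reg pos_ovf;

  begin
    // Positive and above 255: the low 8 bits alone are not meaningful.
    pos_ovf = ~in[11] & (|in[10:8]);

    if (in[11])                      // negative: clip to the lower limit
      unsgn_sat_res = low_limit;
    else if (pos_ovf)                // above 255: clip to the upper limit
      unsgn_sat_res = up_limit;
    else if (in[7:0] <= low_limit)   // in range 0..255: compare the low bits
      unsgn_sat_res = low_limit;
    else if (in[7:0] >= up_limit)
      unsgn_sat_res = up_limit;
    else
      unsgn_sat_res = in[7:0];
  end

endfunction

Note that the overflow check has to come before the low-bit comparisons. Otherwise a value such as 256 (in[7:0] = 0) would pass the lower-limit test and be clipped to low_limit instead of up_limit.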
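One way to sanity-check a bit-level clipper like this is to compare it against a plain arithmetic clamp. The sketch below is such a reference model, not part of the design itself. The names unsgn_clip_ref_demo and ref_clip, and the limits 16 and 235, are assumptions chosen for illustration.

```verilog
// Illustrative reference model: clamp a signed 12-bit value between two
// unsigned 8-bit limits using ordinary signed comparisons.
module unsgn_clip_ref_demo;

  function [7:0] ref_clip;
    input signed [11:0] in;
    input        [7:0]  up_limit;
    input        [7:0]  low_limit;
    begin
      // Zero-extend each limit to 12 bits and treat it as signed, so that
      // both sides of the comparison are signed 12-bit values.
      if (in <= $signed({4'b0000, low_limit}))
        ref_clip = low_limit;
      else if (in >= $signed({4'b0000, up_limit}))
        ref_clip = up_limit;
      else
        ref_clip = in[7:0];
    end
  endfunction

  initial begin
    $display("%0d", ref_clip(-12'sd5,  8'd235, 8'd16)); // 16  (negative input)
    $display("%0d", ref_clip(12'sd100, 8'd235, 8'd16)); // 100 (inside the limits)
    $display("%0d", ref_clip(12'sd256, 8'd235, 8'd16)); // 235 (above 8-bit range)
  end
endmodule
```

Since the input is only 12 bits wide, looping over all 4096 values and comparing the bit-level function against a reference like this is a cheap exhaustive check.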

Case 2: Clipping between signed limits
If the result to be clipped is wider than the final result after clipping, it is necessary to check for both positive overflow and negative overflow.
Example:
a_unsat[11:0] is the signed result before saturation.
a_sat[7:0] is the signed result after saturation.
upper_limit[7:0] and lower_limit[7:0] both are signed saturation limits.

A value fits in 8 signed bits only when bits [11:7] of the unsaturated result are all equal to the sign bit, so both overflow checks look at bits [10:7].
Positive overflow: the sign bit is 0 and at least one of bits [10:7] is 1.
pos_ovf = ~a_unsat[11] & (|a_unsat[10:7])
Negative overflow: the sign bit is 1 and at least one of bits [10:7] is 0.
neg_ovf = a_unsat[11] & ~(&a_unsat[10:7])
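The range check can be tried on the boundary values of the 8-bit signed range. A 12-bit value fits in 8 signed bits only when bits [11:7] are all copies of the sign bit, so the sketch below flags negative overflow when the sign bit is set and bits [10:7] are not all ones. The module name sgn_ovf_demo and the test values are only illustrative.

```verilog
// Illustrative sketch: overflow flags for a 12-bit signed value that must
// fit into 8 signed bits (-128 .. +127).
module sgn_ovf_demo;
  reg signed [11:0] a_unsat;

  wire pos_ovf = ~a_unsat[11] &  (|a_unsat[10:7]); // positive, above +127
  wire neg_ovf =  a_unsat[11] & ~(&a_unsat[10:7]); // negative, below -128

  initial begin
    a_unsat = 12'sd127;  #1 $display("%0d pos=%b neg=%b", a_unsat, pos_ovf, neg_ovf); // 127 pos=0 neg=0
    a_unsat = 12'sd128;  #1 $display("%0d pos=%b neg=%b", a_unsat, pos_ovf, neg_ovf); // 128 pos=1 neg=0
    a_unsat = -12'sd128; #1 $display("%0d pos=%b neg=%b", a_unsat, pos_ovf, neg_ovf); // -128 pos=0 neg=0
    a_unsat = -12'sd129; #1 $display("%0d pos=%b neg=%b", a_unsat, pos_ovf, neg_ovf); // -129 pos=0 neg=1
  end
endmodule
```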
Finally, the clipping function looks like this.

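function [7:0] sgn_sat_res;
  input signed [11:0] in;
  input signed [7:0]  up_limit;
  input signed [7:0]  low_limit;

  reg pos_ovf;
  reg neg_ovf;

  begin
    // in fits in 8 signed bits only when bits [11:7] are all equal.
    pos_ovf = ~in[11] &  (|in[10:7]);  // positive and above +127
    neg_ovf =  in[11] & ~(&in[10:7]);  // negative and below -128

    if (neg_ovf)                       // below -128: clip to the lower limit
      sgn_sat_res = low_limit;
    else if (pos_ovf)                  // above +127: clip to the upper limit
      sgn_sat_res = up_limit;
    else if ($signed(in[7:0]) <= $signed(low_limit))
      sgn_sat_res = low_limit;
    else if ($signed(in[7:0]) >= $signed(up_limit))
      sgn_sat_res = up_limit;
    else
      sgn_sat_res = in[7:0];
  end

endfunction

As in Case 1, both overflow flags are tested before the low-bit comparisons, because in[7:0] is only a valid signed value when neither flag is set. For example, 128 has in[7:0] = 8'h80, which reads as -128 when treated as signed.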
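Because the input and both limits are signed in this case, a behavioural reference can rely on Verilog's signed arithmetic to sign-extend the limits. The following is a sketch of such a reference clamp for cross-checking the bit-level version. The names sgn_clip_ref_demo and ref_clip_s, and the limits of +100 and -100, are assumptions made for illustration.

```verilog
// Illustrative reference model: clamp a signed 12-bit value between two
// signed 8-bit limits using plain signed comparisons.
module sgn_clip_ref_demo;

  function signed [7:0] ref_clip_s;
    input signed [11:0] in;
    input signed [7:0]  up_limit;
    input signed [7:0]  low_limit;
    begin
      // Every operand is signed, so the 8-bit limits are sign-extended
      // to 12 bits before each comparison.
      if (in <= low_limit)
        ref_clip_s = low_limit;
      else if (in >= up_limit)
        ref_clip_s = up_limit;
      else
        ref_clip_s = in[7:0];
    end
  endfunction

  initial begin
    $display("%0d", ref_clip_s(-12'sd200, 8'sd100, -8'sd100)); // -100 (below the range)
    $display("%0d", ref_clip_s(12'sd128,  8'sd100, -8'sd100)); // 100  (low 8 bits alone would read -128)
    $display("%0d", ref_clip_s(12'sd5,    8'sd100, -8'sd100)); // 5    (inside the limits)
  end
endmodule
```

The second call is the interesting one: it is the kind of input where a comparison on in[7:0] alone would pick the wrong limit.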


This post is for my own reference and for anyone else who is interested.
