If those blocks are well separated from each other (e.g., 16 seconds apart), you might be able to extract the time series at a region after removing confounding effects such as slow drift and head motion. Another possibility is to model the BOLD response for each condition with basis functions such as TENT/TENTzero or CSPLIN/CSPLINzero.
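To give a feel for what the TENT option does, here is a rough NumPy sketch, not AFNI code. It assumes TENT(b,c,n) means n piecewise-linear "hat" functions with knots evenly spaced from b to c, so that the fitted coefficients are the estimated response heights at the knots. The function name `tent_basis`, the 0.5 s sampling grid, and the toy response heights are all made up for illustration.

```python
import numpy as np

def tent_basis(t, b, c, n):
    """Hypothetical helper: n tent (hat) functions with knots evenly spaced on [b, c].

    Returns an array of shape (len(t), n); column k peaks at 1 on knot k and
    falls linearly to 0 at the neighboring knots. Outside [b, c] all columns are 0.
    """
    t = np.asarray(t, dtype=float)
    knots = np.linspace(b, c, n)
    width = (c - b) / (n - 1)  # spacing between adjacent knots
    basis = np.clip(1.0 - np.abs(t[:, None] - knots[None, :]) / width, 0.0, None)
    basis[(t < b) | (t > c)] = 0.0  # first and last tents are half tents
    return basis

# Time since stimulus onset, sampled every 0.5 s over a 16 s window (assumed values).
t = np.arange(0.0, 16.5, 0.5)
B = tent_basis(t, 0.0, 16.0, 9)  # 9 tents, i.e. one knot every 2 s

# Toy noise-free response, defined by made-up heights at the 9 knots.
knots = np.linspace(0.0, 16.0, 9)
heights = np.array([0.0, 0.2, 0.8, 1.0, 0.7, 0.3, 0.1, 0.0, 0.0])
y = np.interp(t, knots, heights)

# Least-squares fit: one coefficient per tent, i.e. per knot.
coef, *_ = np.linalg.lstsq(B, y, rcond=None)
```

No response shape is assumed in advance here; the fit just estimates one amplitude per knot and interpolates linearly in between. In a real analysis the regression would also include the drift and motion regressors, and one set of tents would be entered per condition.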
Gang