How much "rest" time occurs before and after the onset of the continuous stream of stimuli (that is, at the beginning and end of the imaging run)? Probably not much, I'd guess.
In this situation, IMHO you can only reasonably find the difference between the linguistic condition #1 and the foil condition #2 -- that is, you can't find "linguistic only" activation, just places where "linguistic minus foil" is significantly nonzero.
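To make that point concrete, here is a small Python sketch. It is not AFNI code: plain boxcars stand in for the BLOCK(20,1) responses, and the TR, run length, and amplitudes are made-up values chosen only for illustration. It shows that with no rest the two condition regressors add up to the constant baseline column, and that treating condition #2 as baseline gives back only the difference between the two amplitudes.

```python
# Simplified sketch: boxcars stand in for AFNI's BLOCK(20,1) responses.
TR = 2.0            # assumed repetition time in seconds
BLOCK_LEN = 20.0    # block duration in seconds, as in BLOCK(20,1)
N_VOLS = 120        # assumed run length in volumes (no rest at either end)

def boxcars(n_vols, tr, block_len):
    """Return (cond1, cond2) indicator lists for strictly alternating blocks."""
    cond1, cond2 = [], []
    for i in range(n_vols):
        in_cond1 = int((i * tr) // block_len) % 2 == 0
        cond1.append(1.0 if in_cond1 else 0.0)
        cond2.append(0.0 if in_cond1 else 1.0)
    return cond1, cond2

def mean(values):
    return sum(values) / len(values)

cond1, cond2 = boxcars(N_VOLS, TR, BLOCK_LEN)

# With no rest, cond1 + cond2 equals the constant baseline column at every
# time point, so a model with [constant, cond1, cond2] is collinear.
sum_is_constant = all(a + b == 1.0 for a, b in zip(cond1, cond2))

# Hypothetical noise-free voxel with made-up amplitudes.
beta1, beta2 = 3.0, 1.0
signal = [beta1 * a + beta2 * b for a, b in zip(cond1, cond2)]

# Treat cond2 as "rest": fitting [constant, cond1] reduces to two means.
baseline = mean([s for s, a in zip(signal, cond1) if a == 0.0])
effect = mean([s for s, a in zip(signal, cond1) if a == 1.0]) - baseline

print(sum_is_constant, baseline, effect)
```

The fitted "condition #1" effect comes out as beta1 - beta2, and beta2 is absorbed into the baseline, so the individual amplitudes cannot be separated from this kind of design.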
There would be two ways to analyze this in AFNI. The first, which I would try since it is simpler, would be to treat condition #2 as "rest" -- that is, don't put in stimulus timing for it. Then either excise the actual rest periods at the run starts/ends and analyze that data as normal -- with BLOCK(20,1) timing for condition #1 -- or put in BLOCKs to catch the presumably small un-excised "true" rest periods.
The second would be to put in BLOCK(20,1) timing for the 2 different stimulus classes (in separate files, of course), and then modify the default AFNI baseline generation to remove the constant term in the baseline model -- that is, keep the drift terms like t^1, t^2, etc., but get rid of t^0=1. This removal is possible, but requires a more tricksy usage of 3dDeconvolve and afni_proc.py (I've never had a request like this before) -- which is why I would start with the first method and see if the results are reasonable.
I hope this explication is clear.