ContentWord x Transcript Representation