The isolation and characterization of the complementary DNAs (cDNAs) and the gene that code for an epithelial tumor antigen (H23-ETA), aberrantly expressed in human breast tumor tissue, are described here. A diversity of H23-ETA protein forms is generated by a series of alternative splicing events that occur in regions located upstream and downstream of a central tandem 20 amino acid (aa) repeat array (TRA) that is rich in proline, serine and threonine residues. In the upstream region, differential usage of alternative splice acceptor sites generates two protein forms containing putative signal peptides of differing hydrophobicity at the NH2 terminus. In the region downstream of the TRA, one mRNA transcript is collinear with the gene and defines a 160 aa open reading frame (secreted or sec form). A second cDNA corresponds to an mRNA that is generated by a series of splicing events and codes for 149 aa downstream of the TRA that are identical to the aa sequence of the unspliced cDNA, after which it diverges and continues for an additional 179 aa. This sequence (transmembrane or tm form) contains a highly hydrophobic transmembrane domain of 28 aa followed by a hydrophilic “transfer-stop signal” (Arg Arg Lys) and a cytoplasmic domain of 72 aa. The various protein forms (alternative signal sequences, secreted and transmembrane) are likely routed to different cytoplasmic, cell membrane and extracellular compartments. Reverse PCR indicates that the relative ratios of the alternatively spliced forms vary among different epithelial tissues. To identify the individual protein species, monoclonal antibodies (mAb) are being generated against synthetic peptides unique to each form. The H23-ETA gene was also isolated and sequenced, revealing a putative promoter region that includes a ‘TATA’ box, Sp1 binding elements and an upstream putative hormone responsive element.
Consistent with these findings, H23-ETA expression increased following hormonal treatment of BT549 breast tumor cells. These molecular studies have revealed novel H23-ETA protein and gene structures and will facilitate future investigations of H23-ETA function and its interaction with other cellular proteins.