The Swiss Text Corpus relies on collaboration and open standards so that it can use the best available technology for its purposes.
Open standards in text and corpus technology are indispensable for providing sustainable digital resources. Like many other corpus projects, the Swiss Text Corpus annotates the XML versions of its documents according to the guidelines of the Text Encoding Initiative (TEI). The scans, together with the underlying OCR text, are stored as archivable PDFs (PDF/A according to ISO 19005-1).
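To illustrate what a TEI-encoded document looks like in practice, here is a minimal sketch in Python. The sample document, its title and text, and the helper name `read_tei` are invented for illustration and are not taken from the Swiss Text Corpus; only the TEI namespace and the element names come from the TEI guidelines. The actual corpus files are likely to carry far richer headers and annotation.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the TEI guidelines (P5).
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical minimal TEI document; content is made up for this example.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Sample document</title>
      </titleStmt>
      <publicationStmt>
        <p>Unpublished example.</p>
      </publicationStmt>
      <sourceDesc>
        <p>Created for illustration only.</p>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p>First paragraph of the text.</p>
      <p>Second paragraph of the text.</p>
    </body>
  </text>
</TEI>"""


def read_tei(xml_text):
    """Return the title and the body paragraphs of a TEI document."""
    root = ET.fromstring(xml_text)
    # The title lives in the header, under fileDesc/titleStmt.
    title = root.findtext(
        "tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:title",
        namespaces=TEI_NS,
    )
    # Collect the plain text of every paragraph inside the body.
    paragraphs = [
        "".join(p.itertext()).strip()
        for p in root.iterfind("tei:text/tei:body//tei:p", TEI_NS)
    ]
    return title, paragraphs


title, paragraphs = read_tei(SAMPLE_TEI)
print(title)
for paragraph in paragraphs:
    print(paragraph)
```

Because TEI is plain XML with a published namespace, any standard XML library can read it, which is part of what makes the format suitable for long-term use.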
We mainly use open-source software and/or software from our partner projects to process the corpus and publish it on the Internet. The search interface of the Swiss Text Corpus is built on the web framework Django. The linguistic search engine that indexes the corpus texts in the background is DDC, developed by our Berlin partner project DWDS.
We welcome any exchange of know-how in the field of corpus technology.