Minutes of the meeting of the International Commission on Computer Supported Processing of Mediæval Slavonic Manuscripts and Early Printed Books to the International Committee of Slavists, held virtually on September 15th, 2020

Members participating:

Prof. Sorin Paliga left after the first part of the meeting.

The first part of the Meeting was open to people outside the Commission.

Prof. David Birnbaum gave a presentation on CollateX: collation in editions and the Gothenburg model. After the presentation, Prof. Achim Rabus opened the discussion with a question on the algorithm and on whether a normalisation step would be needed if AI were used instead. Prof. Birnbaum answered that machine learning results improve as the training set grows, so very large training sets are needed, whereas the available sets are relatively small. Prof. Ralph Cleminson also commented on the ability of AI to break the Gothenburg model.

In the next talk, Prof. Achim Rabus presented the Transkribus project, its HTR software, and the models designed to transcribe different kinds of pre-modern Slavonic sources. Questions were asked by Prof. Sebastian Kempgen and Prof. Hanne Martine Eckhoff (whether one can turn the language model off, because it could lead to erroneous results), and Prof. Matija Ogrin.


The second part of the Meeting was dedicated to the Commission’s activities, and Prof. Rabus asked that only the members stay online.

The report for the period 2018–2020 was distributed beforehand. In his review of the Commission’s various activities, Prof. Rabus asked the members to consider more options for collaboration, especially initiatives on mass digitisation. Prof. Cleminson noted that there are different approaches to digitisation: it is not enough to have the same thing in digitised form, and digitising a manuscript is different from digitising an edition. Dr. Eckhoff commented that the corpus linguistics perspective should be taken into account as well.

Prof. Rabus proposed Prof. Roland Meyer (Humboldt University), the creator of the well-known Russian diachronic corpus (RRuDi), as a potential new member of the Commission; the proposal was approved unanimously. Prof. Sebastian Kempgen asked whether other new members could be accepted; he nominated Dr. Jürgen Fuchsbauer, who has since accepted the offer to become a member of the Commission.

Prof. Rabus recommended using the Commission website (https://www.obshtezhitie.net/) as a hub for the projects of the Commission’s members, and Prof. Cleminson agreed to continue to serve as webmaster. The Commission members are asked to send all pertinent information to the Secretary (Dr. Tsvetana Dimitrova), Prof. Rabus and Prof. Cleminson.

Prof. Rabus opened a discussion on encoding and fonts at the request of Prof. Cynthia Vakareliyska, who had made a keyboard driver but had encountered some problems. Prof. Kempgen, Prof. Cleminson and Prof. Bunčić participated in the discussion on keyboard drivers. Prof. Bunčić has made a keyboard driver based on Unicode, but it does not work on Mac. The prevailing opinion was that it would be good to use a single driver, with convertibility to other fonts (at least other Unicode fonts). Prof. Kempgen said that issues related to Unicode and encoding have to be distinguished from issues related to keyboard drivers: Unicode has become a more or less stable system with complete sets, but users’ needs will differ because of the different keyboards in use (Russian, Bulgarian, etc.). He also gave a brief overview of the current state of affairs.

Prof. Rabus proposed forming a subcommittee to discuss these issues, with representatives from different keyboard use traditions, including Russian, Bulgarian and Western ones. Prof. Miltenova proposed Prof. Andrej Bojadžiev from Bulgaria, and Prof. Birnbaum backed the candidacy because Prof. Bojadžiev is a Linux user and a Linux user would be needed. Prof. Kempgen added that Windows and Mac users would be needed as well. The following members of the subcommittee on issues around fonts and keyboards were proposed: Prof. Sebastian Kempgen, Prof. Cynthia Vakareliyska, Prof. Andrej Bojadžiev, Prof. Daniel Bunčić, Prof. Ralph Cleminson, Prof. Sorin Paliga. The subcommittee has already started its work. Prof. Lyashevskaya proposed that Prof. Baranov also be a member, but Prof. Baranov offered to participate as a consultant only. Prof. Cleminson asked the members to take into account different users’ needs and expectations.

During the overall discussion, Prof. Sebastian Kempgen asked whether there had been any feedback from the International Committee. Prof. Rabus informed the members that there had been a reply from Prof. Peter Žeňuch. Prof. Bunčić added that the reply indicated that the Commission was an active one. Prof. Cleminson proposed that the statute of the Commission be modified with respect to issues related to corpus linguistics, historical linguistics and AI. The officers (especially Rabus/Dimitrova) will prepare a first draft and circulate it.

Prof. Miltenova proposed putting online (on the Obštežitie website) the PDF files of the volume of proceedings of the last edition of the El’ Manuscript conference in Vienna in 2018 (“Digital and Analytical Approaches to the Written Heritage”). Unfortunately, the issues of the journal “Scripta & e-Scripta” cannot be put online free of charge because they are subject to a contractual agreement between the Institute of Literature of the Bulgarian Academy of Sciences and the CEEOL database.

Prof. Rabus informed the members about the options for the postponed edition of the El’ Manuscript conference, scheduled for April 2021: hold it in hybrid form, hold it completely online, or postpone it again. All members backed the hybrid mode for the El’ Manuscript conference in April 2021. Prof. Birnbaum said that this was feasible, provided the time zones of the participants and audience were taken into account. Prof. Rabus added that pre-recorded sessions could also be included.