15.12.17 - DHI and the LIDIAP are happy to congratulate Gülcan Can on passing her PhD.

Photo from left to right: Dr. Jean-Marc Odobez (Idiap-EPFL), Dr. Gülcan Can, Prof. Daniel Gatica-Perez (Idiap-EPFL)

After passing the private defense, Gülcan presented publicly on Monday 27 November 2017. The title of her thesis is "Visual Analysis of Maya Glyphs via Crowdsourcing and Deep Learning".

Abstract of thesis

In this dissertation, we study visual analysis methods for complex ancient Maya writings. The unit sign of a Maya text is called a glyph, and may have either semantic or syllabic significance. There are over 800 identified glyph categories, and over 1400 variations across these categories. To enable fast manipulation of data by scholars in the Humanities, it is desirable to have automatic visual analysis tools for tasks such as glyph categorization, localization, and visualization. Analysis and recognition of glyphs are challenging problems. The same patterns may be observed in different signs but with different compositions, so the inter-class variance can be very low. Conversely, the intra-class variance can be high, as the visual variants within the same semantic category may differ to a large extent except for some patterns specific to the category. Another related challenge of Maya writings is the lack of a large dataset to study the glyph patterns. Consequently, we study local shape representations, both knowledge-driven and data-driven, over a set of frequent syllabic glyphs as well as other binary shapes, i.e. sketches. This comparative study indicates that a large data corpus and a deep network architecture are needed to learn data-driven representations that can capture the complex compositions of local patterns.
To build a large glyph dataset in a short period of time, we study a crowdsourcing approach as an alternative to time-consuming data preparation by experts. Specifically, we work on segmenting individual glyphs out of glyph-blocks from the three remaining codices (i.e. folded bark pages painted with a brush). With gradual steps in our crowdsourcing approach, we observe that providing supervision and careful task design are key aspects for non-experts to generate high-quality annotations. This way, we obtain a large dataset of over 9000 individual Maya glyphs.
We analyze this crowdsourced glyph dataset with both knowledge-driven and data-driven visual representations. First, we evaluate two competitive knowledge-driven representations, namely Histogram of Oriented Shape Context and Histogram of Oriented Gradients. Secondly, thanks to the large size of the crowdsourced dataset, we study visual representation learning with deep Convolutional Neural Networks. We adopt three data-driven approaches: assessing representations from pretrained networks, fine-tuning the last convolutional block of a pretrained network, and training a network from scratch. Finally, we investigate different glyph visualization tasks based on the studied representations. First, we explore the visual structure of several glyph corpora by applying a non-linear dimensionality reduction method, namely t-distributed Stochastic Neighbor Embedding. Secondly, we propose a way to inspect the discriminative parts of individual glyphs according to the trained deep networks. For this purpose, we use the Gradient-weighted Class Activation Mapping method and highlight the network activations as a heatmap visualization over an input image. In a perceptual crowdsourcing study, we assess whether the highlighted parts correspond to distinguishing parts of glyphs. Overall, this thesis presents a promising crowdsourcing approach, competitive data-driven visual representations, and interpretable visualization methods that can be applied to explore various other Digital Humanities datasets.

Source: Institute of Digital Humanities