Doing Web-based Research on Under-Documented Languages

Much of our knowledge of speech and language comes from the study of a small subset of the world’s languages. Along with collaborators Hermann Keupdjio (University of British Columbia), Amelia Kimball (Laboratoire de Linguistique Formelle – CNRS Paris), and Constantine Kouankem (University of Yaounde I), we are working to increase the representation of under-documented languages in linguistic research by promoting web-based research with speakers of under-documented and endangered languages. Web-based surveys and experiments enable researchers to collect data on languages from a distance. This can be especially useful where researchers are unable to travel to collect data on a language due to economic and/or geographic barriers. We hope that these efforts will also promote collaboration between researchers in different areas of the world who are interested in the study of under-documented and endangered languages. Check out our recent paper presented at the special session on Speech Perception in Underrepresented Populations at the International Congress of Phonetic Sciences in Melbourne, Australia.

Here, too, are some resources for pursuing web-based research on under-documented and endangered languages.

Web-based survey and experiment tools

Amazon Mechanical Turk – A tool for hosting online surveys and experiments via Amazon’s platform, allowing distribution of surveys to ‘workers’ (AMT users) who will do the survey in exchange for payment. While AMT itself offers some functionality for survey-building, most researchers build a survey using a different platform (such as Qualtrics) and host with AMT. More information can also be found here.

Qualtrics – A very powerful and user-friendly online survey tool which allows for incorporation of sound and video stimuli. Requires a subscription.

SurveyMonkey – A popular and user-friendly online survey tool which allows for collection of a variety of response types (multiple choice, radio button, open-ended question, etc.). Some functionality is available without a subscription.

Google Forms – A survey tool similar to SurveyMonkey which allows for unlimited data collection for users with a Google account.

Ibex Farm – A very flexible and free tool for running online experiments which allows for display of audio and video stimuli and collection of a variety of response types using pre-built ‘controllers’. Requires some basic JavaScript and HTML knowledge.

PennController – An enriched set of tools for running experiments via Ibex.

PsychoPy – Free Python-based tool which allows for creation of a wide variety of speech perception and psycholinguistic experiments.

jsPsych – Free tool for developing online surveys using JavaScript. Very flexible and allows for display of audio and video stimuli as well as collection of a wide variety of response types using pre-made plugins. Excellent tool for collecting online response time data.

Additional discussion comparing these tools can be found here and here.


Other common concerns with running online experiments with speakers of under-documented languages

Many communities where under-documented languages are spoken may lack the resources or infrastructure necessary for running online experiments. There may also be challenges for data collection where a language doesn’t have a writing system or where speakers do not read or write the language. Here are some common questions about running online experiments in under-resourced areas and our suggestions on how to manage the process.

How do I pay participants?

Amazon Mechanical Turk is a great tool for collecting data and paying subjects online, but it is currently only available for use in a limited set of countries (though the list is growing, so check back soon!). Unfortunately, few options exist for transferring money overseas, and many, such as Western Union, charge fees and offer very unfavorable exchange rates. Transferring money to each participant individually is both time-consuming and financially burdensome. To avoid this, we recommend finding a trusted community member (ideally a collaborator!) who is local and can receive a single transfer of money and distribute it to participants. This way, only one transfer (with its associated fees) is necessary.
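To make the savings concrete, here is a rough sketch of the arithmetic in Python. Every figure in it (30 participants, a payment of 5.00 each, a flat fee of 4.00 per transfer, and a 3% exchange-rate markup) is a made-up assumption for illustration, not a quote from any real transfer service, and the function names are our own. Real fee schedules are often tiered, so treat this as a template to fill in with the numbers your provider actually charges.

```python
from decimal import Decimal


def transfer_cost(amount, flat_fee, fx_markup):
    """Cost of one transfer: a flat fee plus a proportional exchange-rate markup.

    This simple fee model is an assumption; check your provider's actual terms.
    """
    return flat_fee + amount * fx_markup


def compare_costs(n_participants, payment, flat_fee, fx_markup):
    """Return (cost of paying everyone separately, cost of one pooled transfer)."""
    individual = n_participants * transfer_cost(payment, flat_fee, fx_markup)
    pooled = transfer_cost(n_participants * payment, flat_fee, fx_markup)
    return individual, pooled


# Hypothetical figures, for illustration only.
individual, pooled = compare_costs(
    n_participants=30,
    payment=Decimal("5.00"),
    flat_fee=Decimal("4.00"),
    fx_markup=Decimal("0.03"),
)

print(f"30 separate transfers: {individual:.2f} in fees")
print(f"1 pooled transfer:     {pooled:.2f} in fees")
print(f"Saved by pooling:      {individual - pooled:.2f}")
```

Under this simple model the exchange-rate markup is the same either way, since the same total amount is converted. The saving comes entirely from paying the flat fee once instead of once per participant.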

How do I display instructions and stimuli if my participants do not read the language under investigation?

If participants do not read the language under investigation (or if the language cannot be written because it lacks an orthography), instructions can be recorded and played to participants in lieu of written instructions. Alternatively, it may be good to recruit a local research liaison (who can be compensated for their help with the project) to guide participants through the process.

If I won’t be collecting data in the country, do I still need to obtain a research permit if the country where the language is spoken requires one?

The era of online data collection is still new and many countries have not implemented updated regulations regarding the procurement of research permits for individuals doing transnational online research. We stress that researchers should err on the side of caution in this regard and seek permission to conduct online research in the same way they would if they were conducting data collection in the country.