---------------------------------------------
# Mutsun Language Database Exports

Preferred citation (DataCite format):

Warner, Natasha; Butler, Lynnika; Van Volkinburg, Heather; Geary, Quirina (2022). Mutsun Language Database Exports. University of Arizona Research Data Repository. Dataset. 10.25422/azu.data.19880254

Corresponding Author: Natasha Warner, Department of Linguistics, UA, nwarner@email.arizona.edu

License: CC BY-NC 4.0

DOI: https://doi.org/10.25422/azu.data.19880254

---------------------------------------------
## Summary

Mutsun language database backup and XML exports. Mutsun is a Native American language spoken near San Juan Bautista, California. The last fluent first-language speaker passed away in 1930. The materials here represent analyzed texts in the language that were documented in field notes of early anthropologists, linguists, and a mission priest working with fluent native speakers.

The versions here are XML backups for safe long-term preservation of the information. PDFs for use by language learners and others are archived through the UA Campus Repository (http://hdl.handle.net/10150/665136). Materials are also archived at the California Language Archive (Department of Linguistics, UC Berkeley, see References below) and may be archived at additional archives. A virtual machine containing the FieldWorks FLEx software and the database is also available (https://doi.org/10.25422/azu.data.20010236).

If you wish to cite this material or use it for additional analyses or publications, please attempt to contact a member of the Mutsun community, starting with co-author Quirina Geary (qgeary@tamien.org). This is not for permission, but to keep the community informed of what is being published about them.

References

- https://doi.org/10.25422/azu.data.20010236
- http://hdl.handle.net/10150/665136
- https://cla.berkeley.edu/collection/?collid=11308=Materials%20for%20Mutsun%20Text%20Collection
- https://escholarship.org/uc/item/7nx2m3gr

---------------------------------------------
## Files and Folders

1. TextsXMLExports_Public.zip

   XML exports of individual text chapters, intended for long-term preservation of the information associated with each text, in case the FieldWorks Language Explorer (FLEx) software is no longer available
   - Verifiable generic XML export corresponding to each chapter of the PDF text collection (each “text” within the FLEx “texts & words” database)
   - Only the public versions are posted here. Community members can obtain access to the community version (containing a small number of additional entries with culturally protected content) through the California Language Archive

2. DictionaryXMLExports.zip

   XML exports of the English-Mutsun (reversal) dictionary and Mutsun-English dictionary, intended for long-term preservation of the information in the FLEx lexical database, in case the FLEx software is no longer available
   - ConfigDictionaryXMLExport.xhtml (Mutsun-English dictionary) and ReversalDictionaryXMLExport.xhtml (English-Mutsun dictionary)
   - Associated .css files that FLEx generates with the dictionary XML export
   - There is no need for separate public and community versions, since no entire words are omitted from the dictionary

3. Mutsun2022Public 2022-05-24 1653.fwbackup.zip

   Backup of the entire FLEx project, intended for long-term preservation of all the information in the database. This file can be opened in FLEx by using the "Restore from backup" option. It may also be possible to view and edit the backup file itself as an XML file if FLEx is no longer available. A version accessible to community members, with a few additional entries that contain culturally protected information, is available at the California Language Archive.
   - Public version filename: Mutsun2022Public 2022-05-24 1653.fwbackup

---------------------------------------------
## Materials and Methods

For the FieldWorks project backup: FieldWorks Language Explorer (FLEx version 9.0.17.455) (https://software.sil.org/fieldworks/download/).

---------------------------------------------
## Contributor Roles

The roles are defined by the CRediT taxonomy http://credit.niso.org/

- Natasha Warner, University of Arizona: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Supervision, Validation, Writing - original draft, Writing - review & editing
- Lynnika Butler, University of Arizona and Wiyot Tribe: Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Software, Supervision, Validation, Writing - original draft, Writing - review & editing
- Heather Van Volkinburg, University of Arizona: Data curation, Formal analysis, Supervision, Writing - original draft
- Quirina Geary, Tamien Nation: Conceptualization, Data curation, Investigation, Methodology, Validation, Writing - review & editing

---------------------------------------------
## Additional Notes

There is also a disk image of the FLEx project placed at archives, including the FLEx software itself, to provide an additional method of safe long-term storage. Refer to:

Rios, Fernando; Warner, Natasha (2022). Mutsun Language Database Virtual Machine. University of Arizona Research Data Repository. Software. https://doi.org/10.25422/azu.data.20010236

Further explanations:

A. Field names used within the text collection:

Fields have different names when viewed in FLEx and in the human-readable PDF. The XML export uses many of the same field names as are seen within FLEx, but with different formatting and organization. For explanation of the content of each field, see the Introduction to the PDF version of the text collection.


| FLEx | PDF Export |
| --- | --- |
| Word | none (line 1 of each entry) |
| Morphemes | none (line 2) |
| Lex. Entries | none (line 3) |
| Lex. Gloss | none (line 4) |
| Lex. Gram. Info. | none (line 5) |
| Free Eng. | Translation |
| Free Mut. | Orig. spell |
| Free Lat. | Source |
| Free Sp. | Compare |
| Lit. Eng. | Source trans. |
| Lit. Lat. | Research notes |
| Lit. Sp. | not included |
| Lit. Mut. | not included |
| Note | not included |

Three fields should not be cited or used by other researchers:

- The FLEx field Literal Spanish contains a rough English translation of the translation the original source gave, if it was given in Spanish. However, these have not been checked or completed for all entries, and were only meant to assist members of the research team during analysis, not to be publishable translations.
- The FLEx field Literal Mutsun records which members of the research team edited the entry and when (initials and month/year). This is also only for project-internal use, and is not part of a publication.
- The FLEx field Note is used for project-internal notes not meant for publication, such as speculations about what morphemes might be in a word when the analysis was unclear, and notes about alternative analyses that were considered and why. These were written in informal style and are not for citation.

The opaque choice of FLEx-internal field names, such as "Literal Latin," reflects the fact that we needed to encode several types of information that FLEx (at least in its early versions) had no field available for. This is because this project uses archival data rather than fieldwork with living speakers. We needed to encode information such as what page number of which source the entry appeared on, how the original source transcribed the entry, and how the original source translated it. Rather than making custom fields in FLEx, we made use of FLEx's ability to encode both free and literal translations in multiple languages in order to get enough distinct fields for our needs.
Thus, "Free Latin" is used to encode which microfilm reel and page number (and which sentence on the page) an entry comes from, not to encode a free translation into Latin, as the field was designed for.

For information on how to export from FLEx to PDFs of the texts with the correct formatting, see: https://github.com/nwarner-dpl/FLExExport

B. Field names used within the dictionary/lexical database:

Field names in the FLEx lexical database (see Introduction to the dictionary for more information on how each field is used):

1. Included in the published dictionary:
   - Lexeme form: headword
   - Morph type: root, stem, proclitic, suffix, etc.
   - Complex forms: any derived headwords containing this morpheme
   - Complex form type: type of form for derived/inflected/compound headwords
   - Components: morphemes contained in this derived/inflected/compound headword
   - Etymology source form: word in another language that a headword is borrowed from
   - Etymology following comment: language that a headword is borrowed from
   - Sense:
     - Gloss: translation for Mutsun-English dictionary and text collection
     - Reversal entries: translations for English-Mutsun dictionary
     - Grammatical info: part of speech
     - Example: one or more example sentences if good examples are available
     - Translation: translations of example sentences
     - Scientific name: scientific name of plant/animal, if available
     - Anthropology note: cultural information related to the headword if available
     - Grammar note: information on usage with suffixes or in sentences if needed
     - Phonology note: information on pronunciation if the headword contains unexpected sound sequences
     - Semantics note: more specific meaning or information if meaning is unsure
     - Sociolinguistics note: information about usage of the headword in different time periods
     - Status: Attested only once/Me only/etc.: notes about entries that are particularly unsure. See Introduction to the Dictionary for a list.
     - Lexical relations: cross-references to words of similar meaning, native/borrowed word pairs, verb/noun pairs
   - Variants
   - Allomorphs:
     - Stem allomorph: allomorph other than the headword form
     - Environments: environment in which to use the listed allomorph

2. Fields partially/inconsistently filled in, not included in published dictionary: these are fields that we thought of using at one point during the project, but did not use in the end. Information in these fields has not been checked or completed, and should not be relied on or cited by other researchers. These are not intended for any publication, but are part of the backups of the project.
   - Pronunciation and its subfields
   - Sense:
     - Definition (should be equal to Gloss or blank)
     - General note (English): project-internal notes made while working on the entry, including speculations about analyses and alternative analyses considered, not meant to be cited or used further
     - General note (Spanish): date on which the lead researcher approved the entry as having been checked and being ready for inclusion in the dictionary
     - Source: source(s) that gave evidence of the word. This has been superseded by the ability to make concordances on each headword.
     - Semantic domains: has not been filled in consistently as of May 2022
     - Anthropology categories: has not been filled in consistently as of May 2022

3. Some other fields are filled in automatically by FLEx, such as Date created, Date modified, and Publication settings.
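
For readers who process the text-collection exports programmatically, the field correspondences listed in section A can be written down as a small lookup table. The Python sketch below is illustrative only. It assumes that each interlinear entry has already been read into a dictionary keyed by the abbreviated FLEx field labels from section A; that shape is hypothetical and is not the actual element structure of the XML exports. The names `PDF_LABELS`, `INTERNAL_FIELDS`, and `citable_fields` are invented for this example.

```python
# FLEx field label -> label used in the PDF export, transcribed from the
# correspondences in section A. Lines 1-5 of each entry (Word, Morphemes,
# Lex. Entries, Lex. Gloss, Lex. Gram. Info.) carry no label in the PDF
# and are therefore not listed here.
PDF_LABELS = {
    "Free Eng.": "Translation",
    "Free Mut.": "Orig. spell",
    "Free Lat.": "Source",
    "Free Sp.": "Compare",
    "Lit. Eng.": "Source trans.",
    "Lit. Lat.": "Research notes",
}

# Project-internal fields that should not be cited or reused.
INTERNAL_FIELDS = frozenset({"Lit. Sp.", "Lit. Mut.", "Note"})


def citable_fields(entry: dict) -> dict:
    """Drop project-internal fields and relabel the rest as in the PDF export.

    `entry` is a hypothetical dict keyed by FLEx field label. Fields that
    have no PDF label keep their FLEx name.
    """
    return {
        PDF_LABELS.get(name, name): value
        for name, value in entry.items()
        if name not in INTERNAL_FIELDS
    }


if __name__ == "__main__":
    # Placeholder values only; no real Mutsun data is shown here.
    example = {
        "Word": "<word>",
        "Free Eng.": "<English translation>",
        "Free Lat.": "<reel and page>",
        "Lit. Sp.": "<unchecked rough translation>",
        "Note": "<internal note>",
    }
    print(citable_fields(example))
```

Keeping the excluded fields in their own set makes the request not to cite them explicit at the point where entries are filtered, and the mapping can be adjusted once the real structure of an export has been inspected.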