====== Indexing of the CERL Thesaurus ======

This page is still under construction...

===== Available Search Fields =====

The CERL Thesaurus comes with a complex index that allows for sophisticated searches. The following search fields are available:

^ field ^ all words ^ Analyzer ^ Type ^ Fields ^ Description ^
| active_in | no | standard | string/id | 515$3 \\ with $0 'actv','resd','teac','schl','stud' or 'trad' | id of a place of activity of an entity \\ ''active_in:cnl00028786'' |
| address | yes | standard | string | 515$a$d$e$r | search within the address of an entity \\ ''address:"Kerk-straat"'' |
| based_in | no | standard | string/id | 515$3 \\ with $0 'dioc' | id of the place a diocese or parish is based in \\ ''based_in:cnl00014398'' |
| born_in | no | standard | string/id | 515$3 \\ with $0 'brth' | id of the place a person was born in \\ ''born_in:cnl00008322'' |
| child_of | no | standard | string/id | 500$3 \\ with $0 'ex:hasParent' | id of a person who is a parent to another person |
| collaborator_of | no | standard | string/id | 500$3,510$3 \\ with $0 'ex:hasCollaborator' | id of a person or printer who has worked together with another person or printer |
| corporateName | yes | standard | string | 212$a$b$e$r, 412$a$b$e$r | name (heading or variant) of a corporate entity (word) \\ ''corporateName:university'' |
| corporateName.orig | no | none | string | 212$a$b$e$r, 412$a$b$e$r | name (heading or variant) of a corporate entity (phrase) |
| dedup | no | standard | string/code | 001c/p0-2 \\ with 831#1 present in the record | type of a record that is marked for deduplication \\ ''dedup:cnl'' |
| died_in | no | standard | string/id | 515$3 \\ with $0 'deat' | id of a place where a person died |
| duplicate_of | no | standard | string/id | 831#1$a | id of a record that could be a duplicate of another record |
| external_id | yes | standard | string | 801$b$c | ID from an external file \\ ''external_id:gnd'' or ''external_id:(gnd 1029934118)'' |
| feature | no | standard | string/code | 956, 516 | a certain feature/chunk of information of a record \\ ''prdv'' Printers' Devices \\ ''prov'' Provenance Information \\ ''imag'' Image \\ ''feature:prdv'' |
| gender | no | standard | string/code | 120$a | a person's or printer's gender (code) \\ ''a'' female \\ ''b'' male \\ ''u'' unknown \\ ''gender:a'' |
| has_member | no | standard | string/id | 512$3 \\ with $0 'ex:isMemberOf' | id of a corporate body that has members |
| id | yes | standard | string/id | 001, 035$z | CERL Thesaurus ID \\ ''id:cnc00006222'' |
| imprintName | yes | standard | string | 210$a$b$e$r, 410$f$a$b$e$r | a name of a printer/bookseller etc. (heading or variant) (word) \\ ''imprintName:sermartelli'' |
| imprintName.orig | no | none | string | 210$a$b$e$r, 410$f$a$b$e$r | a name of a printer/bookseller etc. (heading or variant) (phrase) |
| last_changed | no | standard | date | 999$a$b \\ only the last occurrence of 999 | date a record was last changed \\ ''last_changed:[2017-12-01 TO * ]'' all records changed on or after 1 December 2017 |
| lived_in | no | standard | string/id | 515$3 \\ with $0 'resd' | id of a place where a person lived |
| location | no | standard | geopoint | 123$d$f | geographic coordinates of a place |
| name | yes | standard | string | 200/210/212/215$a$b$e$r, 400/410/412/415$f$a$b$e$r | name of an entity (word). This index is a combination of personalName, imprintName, corporateName and placeName \\ ''name:hamburg'' |
| name.orig | no | none | string | 200/210/212/215$a$b$e$r, 400/410/412/415$f$a$b$e$r | name of an entity (phrase). This index is a combination of personalName.orig, imprintName.orig, corporateName.orig and placeName.orig |
| name_display_line | yes | standard | string | 200/210/212/215$a$b$e$r \\ only the first occurrence | name of an entity (word). This index field is used primarily for display, not search |
| name_display_line.orig | yes | none | string | 200/210/212/215$a$b$e$r \\ only the first occurrence | name of an entity (phrase). This index field is used primarily for display, not search |
| note | yes | standard | string | 300$a,350$a,356$a | a general note, activity note or geographic note \\ ''note:printer'' |
| parent_of | no | standard | string/id | 500$3 \\ with $0 'ex:hasChild' | id of a person that is the child of another person |
| placeName | yes | standard | string | 215$a$e$r, 415$a$e$r | name (heading or variant) of a place (word) \\ ''placeName:hamburg'' |
| placeName.orig | no | none | string | 215$a$e$r, 415$a$e$r | name (heading or variant) of a place (phrase) |
| personalName | yes | standard | string | 200$a$b$e$r, 400$f$a$b$e$r | name (heading or variant) of a person (word) \\ ''personalName:aristoteles'' |
| personalName.orig | no | none | string | 200$a$b$e$r, 400$f$a$b$e$r | name (heading or variant) of a person (phrase) \\ ''personalName:aristoteles'' |
| predecessor_of | no | standard | string/id | 500/510/512$3 | id of the entity that is the successor of another entity |
| record_flag | no | standard | string | 830$6 | a specific record marker \\ ''record_flag:ba18'' |
| related_to | no | standard | string/id | 500/510/512/515$3 | id of a record related to another record (backlinks) \\ ''related_to:cnl00032270'' |
| school_in | no | standard | string/id | 515$3 \\ with $0 'schl' | id of a place where a person attended school |
| sign | yes | standard | string | 516$a$n | description of a sign, mark, device etc. \\ ''sign:tortuga'' |
| spouse_of | no | standard | string/id | 500$3 \\ with $0 'ex:hasSpouse' | id of a person that has been married to another person |
| studied_in | no | standard | string/id | 515$3 \\ with $0 'stud' | id of a place where a person attended university |
| subordinate_to | no | standard | string/id | 512$3 \\ with $0 'ex:hasSuperiorHierarchicalLevel' | id of a corporate entity that has a sub-division |
| successor_of | no | standard | string/id | 500/510/512$3 \\ with $0 'ex:hasPredecessor' | id of an entity that is a predecessor to another entity |
| superior_to | no | standard | string/id | 512$3 \\ with $0 'ex:hasSubordinateHierarchicalLevel' | id of a corporate entity that is a sub-division of another corporate entity |
| tought_in | no | standard | string/id | 515$3 \\ with $0 'teac' | id of a place where a person was teaching (e.g. at a school or university) |
| traded_in | no | standard | string/id | 515$3 \\ with $0 'trad' | id of a place where an entity conducted business |
| type | no | standard | string/code | 001c/p0-2 | record type \\ ''type:cnp'' |
| visited | no | standard | string/id | 515$3 \\ with $0 'vist' | id of a place a person has visited |
| year_end | yes | standard | date | 340$x | search for entities whose activity or existence ended before, in or after a certain year \\ ''year_end:>1800'' \\ ''year_end:<1500'' |
| year_start | yes | standard | date | 340$x | search for entities whose activity or existence started before, in or after a certain year \\ ''year_start:[1530 TO 1560]'' |

In addition to these searchable fields, the index contains the following fields, which cannot be searched but are used to generate the result set display:

|additional_display_line|additional information on the entity|
|name_display_line|name of the entity|

===== Query Syntax =====

Search fields of the type **string** (except those ending in ''.orig'') are word-indexed and case-insensitive, and diacritics are replaced by their base characters.
They can be searched by typing the search term after the search key's name, separated by a colon:

  * ''name:hamburg'' the term //hamburg// should appear in the //name:// field
  * ''personalName:(visser jacob)'' the terms //visser// and //jacob// should appear in the //personalName:// field in any order, whether or not they are adjacent
  * ''imprintName:"sermartelli e fratelli"'' the terms //sermartelli e fratelli// should appear in the //imprintName:// field in exactly this order

The fields named **.orig** contain the original field's content as a phrase; they are case-sensitive, and diacritics are not replaced:

  * ''imprintName.orig:"Bartolommeo Sermartelli e fratelli, Firenze"''

Search fields of the type **date** allow for range searches:

  * ''last_changed:[2017-05-01 TO 2017-05-31]'' all records that were last edited //in May 2017//
  * ''year_start:<1600'' all entities that were born or started their activity //before the year 1600//.

Search terms can be combined by using the **Boolean Operators** ''AND'', ''OR'' and ''NOT''. Please note that these must always be written in upper case.

  * ''year_start:<1600 AND feature:prdv'' all records where the activity/birth date is before 1600 and that contain descriptions of printers' devices
  * ''name:sermartelli AND name:(bartolomeo OR michelangelo)'' all records where the name field contains //Sermartelli// and either //Bartolomeo// or //Michelangelo//.

**Truncation/Wildcards** can be used as well. Please note that using wildcards will slow down the response time considerably.

  * ''?'' replaces exactly one character
  * ''*'' replaces any number of characters (or none)

A more detailed description of the query syntax can be found here: https://www.cheatography.com/jelle/cheat-sheets/elasticsearch-query-string-syntax/

===== Searching Relationships =====

A number of the search keys listed above make it possible to trace the relationships between the entities described in the CERL Thesaurus. It is assumed that one side of the relationship is known, e.g. a person or a place; using this person's or place's id, we can retrieve the records that link back to it.

For example, knowing the ID for the place "Tübingen", we can retrieve persons who went to university there: ''studied_in:cnl00014526''
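
As a rough illustration of how such relationship queries could be assembled in a script, here is a minimal Python sketch. The helper ''relationship_query'' and the constant ''RELATIONSHIP_FIELDS'' are hypothetical names introduced only for this example; they are not part of the CERL Thesaurus. The sketch only builds query strings from the search keys in the field table and does not contact any server.

```python
# Hypothetical helper for illustration: it only builds query strings
# and does not send them anywhere.

# Relationship search keys (type string/id) copied from the field table.
RELATIONSHIP_FIELDS = {
    "active_in", "based_in", "born_in", "child_of", "collaborator_of",
    "died_in", "duplicate_of", "has_member", "lived_in", "parent_of",
    "predecessor_of", "related_to", "school_in", "spouse_of", "studied_in",
    "subordinate_to", "successor_of", "superior_to", "tought_in",
    "traded_in", "visited",
}


def relationship_query(field, record_id, *extra_clauses):
    """Build '<field>:<id>', optionally AND-ed with further query clauses."""
    if field not in RELATIONSHIP_FIELDS:
        raise ValueError(f"unknown relationship field: {field!r}")
    clauses = [f"{field}:{record_id}", *extra_clauses]
    # Boolean operators must be written in upper case.
    return " AND ".join(clauses)


# Persons who went to university in Tübingen (id taken from the example above).
print(relationship_query("studied_in", "cnl00014526"))

# The same query, restricted to entities whose activity started before 1600.
print(relationship_query("studied_in", "cnl00014526", "year_start:<1600"))
```

Rejecting unknown field names early is a design choice of this sketch; it catches misspelled search keys before a query is ever sent.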