Property talk:P443
Documentation
audio file with pronunciation
Format “(?i).+\.(ogg|oga|flac|wav|opus|mp3)”: value must be formatted using this pattern (PCRE syntax). (Help)
List of violations of this constraint: Database reports/Constraint violations/P443#allowed qualifiers, SPARQL
List of violations of this constraint: Database reports/Constraint violations/P443#Entity types
List of violations of this constraint: Database reports/Constraint violations/P443#mandatory qualifier, SPARQL
List of violations of this constraint: Database reports/Constraint violations/P443#Conflicts with P31, SPARQL
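The format pattern listed above can be tried locally before adding a statement. This is a minimal sketch, assuming the pattern has to match the whole file name; `is_valid_audio_name` is a hypothetical helper name, not part of any Wikidata tool.

```python
import re

# Pattern copied from the format constraint above.
# (?i) makes the whole match, including the extension, case-insensitive.
AUDIO_PATTERN = re.compile(r"(?i).+\.(ogg|oga|flac|wav|opus|mp3)")


def is_valid_audio_name(file_name: str) -> bool:
    """Return True if the entire file name matches the constraint pattern."""
    # fullmatch is used on the assumption that the constraint applies
    # to the complete value, not to a substring of it.
    return AUDIO_PATTERN.fullmatch(file_name) is not None
```

For example, `is_valid_audio_name("Pl-Mikołaj Kopernik.ogg")` is accepted, while a name ending in `.txt` is rejected.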
add qualifier (Help)
Violations query:
SELECT ?item ?file {
  ?item wdt:P443 ?value .
  MINUS { ?item p:P443 [ pq:P407 [] ] }
  MINUS { ?item rdf:type [] }
  BIND(replace(wikibase:decodeUri(SUBSTR(STR(?value), 52)),"_"," ") AS ?file)
  FILTER ( !REGEX( ?file, "^LL-.+$") )
  FILTER ( !REGEX( ?file, "^[A-Z][a-z]-.+$") )
}
LIMIT 100
List of this constraint violations: Database reports/Complex constraint violations/P443#Qualifier P407 missing
add qualifier (simple) (Help)
Violations query:
SELECT ?item ?file {
  ?item wdt:P443 ?value .
  MINUS { ?item p:P443 [ pq:P407 [] ] }
  MINUS { ?item rdf:type [] }
  BIND(replace(wikibase:decodeUri(SUBSTR(STR(?value), 52)),"_"," ") AS ?file)
  FILTER ( REGEX( ?file, "^[A-Z][a-z]-.+$") )
}
LIMIT 100
List of this constraint violations: Database reports/Complex constraint violations/P443#Qualifier P407 missing (filename starts with code)
add qualifier (easy) (Help)
Violations query:
SELECT ?item ?itemLabel ?file ?lang {
  ?item wdt:P443 ?value .
  MINUS { ?item p:P443 [ pq:P407 [] ] }
  MINUS { ?item rdf:type [] }
  BIND(replace(wikibase:decodeUri(SUBSTR(STR(?value), 52)),"_"," ") AS ?file)
  FILTER ( REGEX( ?file, "^LL-.+$") )
  BIND( replace( ?file, "^LL-(Q\\d+) .+$", "$1") as ?lang)
}
LIMIT 100
List of this constraint violations: Database reports/Complex constraint violations/P443#Qualifier P407 missing (LL filename)
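The three "add qualifier" queries above split statements without a P407 qualifier by file-name shape. The sketch below mirrors that split in Python. It rests on assumptions: the file value is taken to be a URI starting with the 51-character `Special:FilePath` prefix (which would explain the `SUBSTR(..., 52)` in the queries), and `wikibase:decodeUri` is approximated with `urllib.parse.unquote`. The function names are hypothetical.

```python
import re
from typing import Optional, Tuple
from urllib.parse import unquote

# Assumed prefix of the file URI; it is 51 characters long, so the
# file name starts at position 52 (1-based), as in the queries.
FILEPATH_PREFIX = "http://commons.wikimedia.org/wiki/Special:FilePath/"

LL_NAME = re.compile(r"^LL-.+$")            # Lingua Libre style name
LL_LANG = re.compile(r"^LL-(Q\d+) .+$")     # captures the language QID
CODE_NAME = re.compile(r"^[A-Z][a-z]-.+$")  # name starting with a code like "Pl-"


def file_name(value_uri: str) -> str:
    """Approximate SUBSTR + wikibase:decodeUri + replace("_", " ")."""
    return unquote(value_uri[len(FILEPATH_PREFIX):]).replace("_", " ")


def classify(name: str) -> Tuple[str, Optional[str]]:
    """Return (report name, language QID if one can be read from the name)."""
    if LL_NAME.match(name):
        match = LL_LANG.match(name)
        # Unlike the SPARQL replace(), which leaves the string unchanged
        # when the pattern does not match, this returns None in that case.
        return ("add qualifier (easy)", match.group(1) if match else None)
    if CODE_NAME.match(name):
        return ("add qualifier (simple)", None)
    return ("add qualifier", None)
```

With the file names used elsewhere on this page, the Lingua Libre recording of "dictionnaire" falls under "add qualifier (easy)" with language `Q150`, and `Pl-Mikołaj Kopernik.ogg` falls under "add qualifier (simple)".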
if applicable, add P443 to an item (e.g. place name pronunciation audio to an item for the place). Based on Property_talk:P7084#Use for pronunciation file categories (Commons) (Help)
Violations query:
SELECT ?item ?comcat WHERE {
  ?item p:P7084 [ ps:P7084 ?r ; pq:P642 wd:Q184377 ]
  MINUS {
    ?a p:P443/pq:P407 ?item .
    FILTER NOT EXISTS { ?a wikibase:lemma [] }
    FILTER NOT EXISTS { ?a rdf:type <http://www.w3.org/ns/lemon/ontolex#Form> }
  }
  [] schema:about ?r ; schema:isPartOf <https://commons.wikimedia.org/> ; schema:name ?com .
  BIND(CONCAT("[[c:", ?com, "\u007C", strafter(?com, ":"), "]]" ) as ?comcat)
}
ORDER BY ?comcat
LIMIT 500
List of this constraint violations: Database reports/Complex constraint violations/P443#Languages with categories available at Commons, but no item with audio file
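The `?comcat` column in the query above is a wikilink assembled with `CONCAT` and `strafter` (`\u007C` is the pipe character). A small Python sketch of the same string construction follows; `commons_category_link` is a hypothetical name and the category title in the usage note is made up for illustration.

```python
def commons_category_link(page_name: str) -> str:
    """Build "[[c:<page>|<text after the first colon>]]" like the query does."""
    # STRAFTER yields an empty string when the separator is absent;
    # str.partition behaves the same way for the part after the separator.
    _, _, label = page_name.partition(":")
    return f"[[c:{page_name}|{label}]]"
```

Calling it with a title such as `Category:Example pronunciation` gives a link whose visible text drops the `Category:` prefix.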
This property is being used by:
Please notify projects that use this property before big changes (renaming, deletion, merge with another property, etc.)
Lists
Discussion
Deprecate for Q-items?
Shouldn't we start deprecating this property for Q-items and move the statements to forms on Lexemes? To me it confuses the notion of concept and the notion of representation. A concept might also have multiple representations: e.g., dictionary (Q23622) may have "wordbook" and "dictionary" as English representations, and these are definitely pronounced differently. --Finn Årup Nielsen (fnielsen) (talk) 12:35, 24 November 2019 (UTC)
- I don't think the property works well on items for concepts that have a common noun as label (e.g. Q23622, which you mentioned), but it's useful on items like Museum of Fine Arts of Lyon (Q511) (see Q511#P443). --- Jura 03:45, 26 November 2019 (UTC)
- No. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:59, 26 November 2019 (UTC)
- There's a related discussion at Wikidata:Administrators'_noticeboard#User:Lingua_Libre_Bot where Wostr suggests that we should expect this to be used on Q-items as a qualifier rather than as a top-level claim. This seems consistent with the discussion at the property proposal. Bovlb (talk) 15:41, 26 November 2019 (UTC)
- I don't know where it was discussed that this property may be used on items as a regular property. I see only that this should be used as a qualifier. And that should be the way this property is used: Lexeme + qualifier on items. Wostr (talk) 00:43, 3 December 2019 (UTC)
- I Support, we should deprecate it.--So9q (talk) 10:19, 26 November 2020 (UTC)
- I Support removing its usage for items, starting by changing the constraints. ChristianKl ❪✉❫ 19:30, 22 December 2020 (UTC)
- What alternative is suggested? --- Jura 07:05, 23 December 2020 (UTC)
- I agree with Jura's comment from 03:45, 26 November 2019 (UTC). For entities with proper names (people, places, companies, etc.), there is usually (but not always) only one "official" pronunciation (i.e. in the native language), and creating a lexeme for each just to put a link to the media file there seems to be overkill. --Matěj Suchánek (talk) 10:26, 23 December 2020 (UTC)
- If someone does text analysis or text-to-speech, having lexemes for proper names can be helpful. ChristianKl ❪✉❫ 17:09, 23 December 2020 (UTC)
- If we are going to do this, we would need to revise the notability criteria for lexemes, which currently forbid adding lexemes for most proper nouns. For that reason, I think I agree with Jura's proposal, even though I can't see any way to make a technical constraint out of it. Vahurzpu (talk) 19:42, 23 December 2020 (UTC)
- I agree with Jura/Matěj/Vahurzpu - some kind of constraint in principle that we should only use this for "proper nouns" would be good. Pronunciation can differ for names - I am pretty confident that Colin Powell (Q150851) and Colin Powell (Q5145485) would have different pronunciations, for example, despite having the same family name (P734)/given name (P735) claims and identical item labels. I can't see how a lexeme-based solution would work there. So Oppose deprecation in general, but no objection to cleaning up its use on generic items like dictionary (Q23622). Andrew Gray (talk) 21:43, 23 December 2020 (UTC)
- I'd like to see some more detail as to what it would entail in practice. Lexemes are intended for Wiktionary's use cases, but I fear we might end up with a level of complexity that's over the top for most Wikipedians (and Wikisourcerors, Wikivoyagers etc.) to really grasp; they, after all, have to buy into the point of Wikidata for the project to flourish. The idea that, say, we're going to have a Q object called "Bill Clinton" (the president), then a Lexeme object called "Bill Clinton" that contains a link to the audio file of the pronunciation of the word, seems needlessly convoluted. Or perhaps "Bill Clinton" (the president) could then link to lexeme objects for "Bill" and "Clinton", with the latter shared with, say, Hillary Clinton, and George Clinton (the funk musician), and the many other Clintons in the world. That might be preferable after all. The issues presented by Andrew Gray seem worth answering, and it'd be worth spelling out how this works in practice so your average workaday Wikidata user can grasp what exactly they have to do in order to, say, manage the Wikidata object for the subject of a Wikipedia article they wrote without needing a linguistics degree. --Tom Morris (talk) 09:15, 24 December 2020 (UTC)
- The three options that have been presented so far (correct this comment if it's wrong):
  - The status quo:
    - dictionary (Q23622) → pronunciation audio (P443) → LL-Q150 (fra)-Fabricio Cardenas (Culex)-dictionnaire.wav
      - language of work or name (P407) → French (Q150)
    - Nicolaus Copernicus (Q619) → pronunciation audio (P443) → Pl-Mikołaj Kopernik.ogg
      - language of work or name (P407) → Polish (Q809)
  - Jura's proposal:
    - (L17419 is dictionnaire (L17419), and is intrinsically linked to French (Q150))
    - L17419-F1 → pronunciation audio (P443) → LL-Q150 (fra)-Fabricio Cardenas (Culex)-dictionnaire.wav
    - L17419-S1 → item for this sense (P5137) → dictionary (Q23622)
    - Nicolaus Copernicus (Q619) → pronunciation audio (P443) → Pl-Mikołaj Kopernik.ogg
      - language of work or name (P407) → Polish (Q809)
  - ChristianKl's proposal, which might be the same as fnielsen's proposal, though I'm not sure:
    - L17419-F1 → pronunciation audio (P443) → LL-Q150 (fra)-Fabricio Cardenas (Culex)-dictionnaire.wav
    - L17419-S1 → item for this sense (P5137) → dictionary (Q23622)
    - Create a new lexeme for Copernicus's name in Polish
    - <new lexeme>-F1 → pronunciation audio (P443) → Pl-Mikołaj Kopernik.ogg
    - <new lexeme>-S1 → item for this sense (P5137) → Nicolaus Copernicus (Q619)
- Two additional considerations that I noticed in the course of writing this:
- Introducing a lexeme would break the easily visible link from item to pronunciation audio, as there's no property that links from items to their corresponding lexemes in general (there is subject lexeme (P6254), but this seems restricted to only items about specific words). item for this sense (P5137) makes sure that it's still possible to query for these links, but I don't think infoboxes would be able to play nicely with anything we convert to lexemes.
- Using lexemes for names would allow us to record different forms of names. This doesn't really come up in English much, but in Latin, "Brutus is a man" translates to "Brutus est vir", while "Caesar saw Brutus" translates to "Caesar vidit Brutum". In Latin, this transformation is usually pretty straightforward, but there very well might be a language where different lexical forms of names are different enough that one would need multiple pronunciation examples. – The preceding unsigned comment was added by Vahurzpu (talk • contribs).
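To make the point about item for this sense (P5137) concrete, here is a rough Python sketch of how a consumer might collect audio for an item when statements live partly on items and partly on lexeme forms. It uses toy in-memory dictionaries filled with the examples from this thread, not a real API response, and every name in it (`ITEM_AUDIO`, `FORM_AUDIO`, `SENSE_ITEM`, `audio_for_item`) is hypothetical.

```python
from typing import Dict, List

# Toy data based on the examples in this thread (illustration only).
ITEM_AUDIO: Dict[str, List[str]] = {
    # pronunciation audio (P443) stated directly on the item
    "Q619": ["Pl-Mikołaj Kopernik.ogg"],
}
FORM_AUDIO: Dict[str, List[str]] = {
    # pronunciation audio (P443) stated on a lexeme form
    "L17419-F1": ["LL-Q150 (fra)-Fabricio Cardenas (Culex)-dictionnaire.wav"],
}
SENSE_ITEM: Dict[str, str] = {
    # item for this sense (P5137): sense ID -> item ID
    "L17419-S1": "Q23622",
}


def audio_for_item(qid: str) -> List[str]:
    """Collect audio stated on the item plus audio on forms of lexemes
    that have a sense pointing at the item via P5137."""
    files = list(ITEM_AUDIO.get(qid, []))
    for sense_id, target in SENSE_ITEM.items():
        if target != qid:
            continue
        lexeme_id = sense_id.split("-")[0]
        for form_id, form_files in FORM_AUDIO.items():
            if form_id.split("-")[0] == lexeme_id:
                files.extend(form_files)
    return files
```

The extra hop through senses and forms is the indirection described above: the link is still queryable, but a consumer such as an infobox has to know to follow it.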
- Thanks for the overview, @Vahurzpu. As I just wrote in the lexicographic Telegram channel: "[In reply to Jan Ainali] Yes, but as is noted in the discussion the links from the Q-namespace to the L-namespace are weak or nonexistent at the moment. We need to fix that in some way so the ordinary editor can easily discover lexeme(s) related to a qitem".
- This is partly a UI problem and partly a question of how to link most intelligently so as to avoid redundancy and maximize utility. I Support Jura's proposal for now because ChristianKl's proposal would create a lot of bloat in the form of proper-name lexemes, which does not seem valuable to me. Jura's proposal may require more work to keep tidy if it cannot be adequately constrained. In that case we could have a bot help us, so that will sort itself out one way or another.--So9q (talk) 11:37, 25 December 2020 (UTC)
- What about constraining use on items to use as a qualifier for native label (P1705)/name in native language (P1559)? That's my 4th proposal. It makes it possible to use the property on any type of item (provided it has an appropriate monolingual-text property). --Infovarius (talk) 00:03, 27 December 2020 (UTC)
- Support I support this 4th proposal which makes most logical sense to me. Ainali (talk) 08:21, 30 December 2020 (UTC)
- Support Wostr (talk) 13:05, 30 December 2020 (UTC)
- Support--So9q (talk) 07:19, 31 December 2020 (UTC)
- I don't think we should add a constraint yet (it would create nearly 40,000 constraint violations and the property is almost certainly being used for things we haven't thought about so far), but I do think it makes sense to migrate pronunciation audio (P443) for names of people and places to qualifiers of native label (P1705)/name in native language (P1559). I think it would also make sense to move statements corresponding to common nouns (like the ones on apple (Q89)) to lexemes. - Nikki (talk) 11:32, 1 January 2021 (UTC)
- First example of the proposed change: [1]. Further possible properties: female form of label (P2521)/male form of label (P3321). But I have some trouble choosing between subject lexeme (P6254) and lexeme sense (P7018). --Infovarius (talk) 23:54, 31 December 2020 (UTC)
- For me, the reason to link to lexemes is to access forms, so I think subject lexeme (P6254) is sufficient. I don't think lexeme sense (P7018) (which confusingly links to senses not lexemes) should be required because lots of lexemes are missing senses and it's not even possible to search for senses, so it would make the edits unnecessarily difficult to do. - Nikki (talk) 11:47, 1 January 2021 (UTC)
- It has been over a year and a half and this discussion has led to nothing. Meanwhile, User:Lingua Libre Bot has added a bunch of statements which really shouldn't be there, such as on physics (Q413) or frequency (Q11652). Wostr (talk) 09:27, 6 August 2022 (UTC)
- It looks like it finally stopped adding statements to items in January 2023.
- I've been slowly moving statements (either to qualifiers of monolingual text statements or to lexemes), but I can't do it all by myself. We need a more organised effort if we are ever going to significantly reduce the number of statements on items.
- I've made a page at Property talk:P443/To fix with the number of statements on items by language, with links to queries.
- - Nikki (talk) 08:57, 31 December 2023 (UTC)
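As an illustration of the move to qualifiers discussed above, the sketch below decides whether a top-level pronunciation audio (P443) statement could be moved onto a single native label (P1705)/name in native language (P1559) statement. It is only a sketch under stated assumptions: `LANG_CODE` is a hypothetical hand-written mapping from language items to language codes, `plan_move` is a hypothetical helper, and a real migration would need to handle references, ranks and other cases the thread has not settled.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical mapping from language items (P407 values) to the
# language codes used by monolingual text values.
LANG_CODE: Dict[str, str] = {"Q809": "pl", "Q150": "fr"}


def plan_move(
    audio: Tuple[str, Optional[str]],
    native_names: List[Tuple[str, str]],
) -> Optional[Tuple[str, str]]:
    """Return (name text, file name) if the audio can become a qualifier
    of exactly one native-name statement, otherwise None.

    audio: (file name, language QID from the P407 qualifier, or None)
    native_names: (text, language code) pairs from P1705/P1559 statements
    """
    file, lang_qid = audio
    code = LANG_CODE.get(lang_qid) if lang_qid else None
    if code is None:
        return None  # missing or unmapped language: leave for a human
    matches = [text for text, name_code in native_names if name_code == code]
    if len(matches) != 1:
        return None  # no target statement, or more than one candidate
    return (matches[0], file)
```

Anything that returns `None` stays on a manual worklist, which fits the cautious approach suggested above of migrating statements before adding a constraint.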
Problem with "language of work"-constraint when used on lexeme forms
The constraint is also triggered on lexemes, which leads to confusion. I suggest we create a new property for use on lexeme forms if this cannot be fixed easily.--So9q (talk) 10:25, 26 November 2020 (UTC)
- @So9q, Lockal: This issue has been fixed. --Dhx1 (talk) 11:57, 26 November 2020 (UTC)
- I've removed the constraint because two thirds of the uses of this property are on forms, which means most statements produce constraint errors and a constraint which most statements don't meet is not a sensible constraint. I've created phab:T269724 asking for a way to restrict the constraint to items. - Nikki (talk) 11:06, 28 December 2020 (UTC)