Property talk:P496
Documentation
identifier for a person
List of violations of this constraint: Database reports/Constraint violations/P496#single best value, SPARQL
List of violations of this constraint: Database reports/Constraint violations/P496#Unique value, SPARQL (every item), SPARQL (by value)
Format “\d{4}-\d{4}-\d{4}-\d{3}[\dX]”: value must be formatted using this pattern (PCRE syntax). (Help)
List of violations of this constraint: Database reports/Constraint violations/P496#Type Q5, SPARQL
List of violations of this constraint: Database reports/Constraint violations/P496#Item P106, search, SPARQL
List of violations of this constraint: Database reports/Constraint violations/P496#Scope, SPARQL
List of violations of this constraint: Database reports/Constraint violations/P496#Entity types
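As an illustration of the format constraint listed above, here is a minimal Python sketch that checks a value against the same pattern. The helper name `has_orcid_format` is hypothetical, and the use of a whole-string match is an assumption about how the constraint is applied; the sketch checks the shape only, not whether the identifier exists.

```python
import re

# Pattern copied from the format constraint on this page.
ORCID_FORMAT = re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]")


def has_orcid_format(value: str) -> bool:
    """Return True if the whole string has the shape required by the pattern."""
    # fullmatch anchors the pattern at both ends (assumed to mirror the constraint check).
    return ORCID_FORMAT.fullmatch(value) is not None


print(has_orcid_format("0000-0002-1825-0097"))  # well-formed value
print(has_orcid_format("David J Paterson"))     # a name, not an identifier
```

A lowercase final "x" is rejected by this pattern, which is the case the second autofix rule on this page addresses.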
ORCID launched its registry services and started issuing user identifiers on 16 October 2012. ORCID identifiers for people who died before 2012 are usually locked or invalid. (Help)
Violations query:
SELECT REDUCED ?item
{
  ?item wdt:P496 [] .
  ?item wdt:P570 ?ddate .
  hint:Prior hint:rangeSafe true .
  FILTER(?ddate < "2012-01-01T00:00:00Z"^^xsd:dateTime)
}
List of violations of this constraint: Database reports/Complex constraint violations/P496#People, who died before launching ORCID
Pattern ^https?:\/\/(www\.)?orcid\.org\/(0000-000(1-[5-9]|2-[0-9]|3-[0-4])\d\d\d-\d\d\d[\dX])/?$ will be automatically replaced with \2. Testing: TODO list
Pattern ^(.*)[xХхΧχ]$ will be automatically replaced with \1X. Testing: TODO list
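The two autofix rules above can be tried out locally. The following is a rough Python sketch, not the code that performs the replacement on Wikidata: the function name `autofix` and the order in which the two rules are applied are assumptions, while the patterns are copied from this page.

```python
import re

# First rule: strip the orcid.org URL wrapper and keep capture group 2.
URL_PATTERN = re.compile(
    r"^https?:\/\/(www\.)?orcid\.org\/"
    r"(0000-000(1-[5-9]|2-[0-9]|3-[0-4])\d\d\d-\d\d\d[\dX])/?$"
)
# Second rule: a trailing Latin x, Cyrillic Х/х or Greek Χ/χ becomes Latin X.
TRAILING_X = re.compile(r"^(.*)[xХхΧχ]$")


def autofix(value: str) -> str:
    """Apply both replacement rules in the order listed (order is an assumption)."""
    value = URL_PATTERN.sub(r"\2", value)
    value = TRAILING_X.sub(r"\1X", value)
    return value


print(autofix("https://orcid.org/0000-0002-1825-0097"))
print(autofix("0000-0002-1694-233x"))
```

Note that the URL pattern only covers identifiers from 0000-0001-5 up to 0000-0003-4, so a URL containing an identifier outside that block is left unchanged by the first rule.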
This property is being used by:
Please notify projects that use this property before big changes (renaming, deletion, merge with another property, etc.)
Missing link to archive?
The property is missing the link to the archive where the proposal is stored. I can't find it in the archive either. --Tobias1984 (talk) 11:29, 8 May 2013 (UTC)
- Oops, I had removed it from proposal but forgot to readd it. --Zolo (talk) 08:38, 15 May 2013 (UTC)
Possible constraint refinement
According to [1] "Initially ORCID iDs will be assigned between 0000-0001-5000-0007 and 0000-0003-5000-0001". The latter might perhaps read "0000-0003-4999-9997" for the last included ORCID iD. And the harder part will be to formulate a complementary constraint for Property:P213 (ISNI). -- Gymel (talk) 16:14, 4 June 2013 (UTC)
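The final character of each identifier quoted above is a check character, which is why the range endpoints end in 7 and 1 instead of 0 and 9. This page does not say how it is computed; the sketch below assumes the ISO 7064 MOD 11-2 scheme that ORCID documents for its identifiers. The function names are hypothetical.

```python
def orcid_check_character(first_15_digits: str) -> str:
    """Compute the check character (ISO 7064 MOD 11-2, assumed) for 15 base digits."""
    total = 0
    for ch in first_15_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def has_valid_check_character(orcid: str) -> bool:
    """Check a hyphenated 16-character ORCID iD against its last character."""
    compact = orcid.replace("-", "")
    return orcid_check_character(compact[:15]) == compact[15]


# The two endpoints quoted from ORCID and the alternative suggested above.
for candidate in ("0000-0001-5000-0007", "0000-0003-5000-0001", "0000-0003-4999-9997"):
    print(candidate, has_valid_check_character(candidate))
```

Under this assumption all three identifiers carry a consistent check character, so 0000-0003-4999-9997 is a well-formed candidate for the last identifier below 0000-0003-5000.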
ORCID scheduled outage; 15 December 2018
Please see: Wikidata:Project chat#ORCID scheduled outage. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:51, 14 December 2018 (UTC)
Items that violate this distinct values constraint?
@Daniel Mietchen: @Pigsonthewing: How can I find and merge them? I'm finding that some of the batches that I've created are generating duplicate items for a given ORCID. How might I easily find these so I can remedy the duplication? Thanks. Trilotat (talk) 23:50, 19 January 2019 (UTC)
- @Trilotat: Watch out for the "Distinct values" template above and the SPARQL queries linked from it. I've adapted one of them to give
#Unique value constraint report for P496: report listing each item
SELECT DISTINCT ?item1 ?item1Label ?item2 ?item2Label ?value
{
  ?item1 wdt:P496 ?value .
  ?item2 wdt:P496 ?value .
  FILTER( ?item1 != ?item2 && STR( ?item1 ) < STR( ?item2 ) ) .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" } .
}
LIMIT 10000
- This currently lists 1430 cases. --Daniel Mietchen (talk) 03:17, 20 January 2019 (UTC)
- @Daniel Mietchen: Thank you, Daniel... Should I bother to merge these (using QS) or should I ignore and let some bot do it (if there is such a bot)? Trilotat (talk) 03:23, 20 January 2019 (UTC)
- There is no bot for this. I merge them from time to time, but every so often the list includes false positives. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:44, 20 January 2019 (UTC)
- hi @Vojtěch_Dostál: I merged Q112343840 that you created "new person from NKC (with ORCID)" in June 2022 to Jaromir Gumulec (Q57020340) that has existed since 2018. Could you please match your contributions to WD before making new items? Cheers! Vladimir Alexiev (talk) 09:43, 5 July 2022 (UTC)
- @Vladimir Alexiev I'll run a check on that. Thanks for reporting it. Vojtěch Dostál (talk) 10:03, 5 July 2022 (UTC)
- @Vladimir Alexiev I merged about 130 pairs of items. Hopefully the National Library got all the ORCIDs right and I did not merge different people. Vojtěch Dostál (talk) 16:56, 5 July 2022 (UTC)
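For anyone checking a batch before uploading it, the pairing logic of the query in this thread can be reproduced offline. This is a minimal Python sketch under stated assumptions: the function name `duplicate_pairs` and the item IDs in the sample are hypothetical, and the input is assumed to be a plain list of (item, ORCID) pairs. Sorting the items before pairing plays the role of the `STR( ?item1 ) < STR( ?item2 )` filter, so each pair is reported once.

```python
from collections import defaultdict
from itertools import combinations


def duplicate_pairs(claims):
    """Return (item1, item2, orcid) for every pair of items sharing an ORCID."""
    items_by_orcid = defaultdict(set)
    for item, orcid in claims:
        items_by_orcid[orcid].add(item)

    pairs = []
    for orcid, items in sorted(items_by_orcid.items()):
        # combinations over the sorted items yields each unordered pair once,
        # with the lexically smaller item first.
        for item1, item2 in combinations(sorted(items), 2):
            pairs.append((item1, item2, orcid))
    return pairs


# Hypothetical sample data, not real Wikidata items.
sample = [
    ("Q1001", "0000-0002-1825-0097"),
    ("Q1002", "0000-0002-1825-0097"),
    ("Q1003", "0000-0002-1694-233X"),
]
print(duplicate_pairs(sample))
```

As noted in the thread, a shared ORCID is only a hint; the pairs still need a manual look before merging.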
Occupation as requirement
Why? Does researcher mean anything? What is the point of a requirement that only invites garbage? Thanks, GerardM (talk) 08:08, 6 October 2020 (UTC)
- @GerardM: Completely agree with what you are saying. Not convinced it is sensible to infer that someone is a researcher just because they have an ORCID. What do we actually gain from adding these claims? Simon Cobb (User:Sic19 ; talk page) 08:13, 25 October 2020 (UTC)
- @Luckyz: as someone who has done that about which Gerard and Simon are complaining (@IagoQnsi: as the person who added the constraint). Mahir256 (talk) 03:53, 17 November 2020 (UTC)
- I don't agree with the constraint; nevertheless, I suggest supporting the presence of an occupation in these biographies because this is, for most of them, the only way to understand their nature, especially if there is no birthdate, place of birth or any additional biographical information. What we gain from this info is an understanding of who we're talking about, using general datatype properties. Luckyz (talk) 10:55, 17 November 2020 (UTC)
I agree with everything said here; I support removing the constraint. --IagoQnsi (talk) 13:43, 17 November 2020 (UTC)
Label
Hi GerardM! I saw that you reverted my edit changing the label of this property from "ORCID iD" to "ORCID". I think "ORCID" is a better name because "ID" is redundant ("ORCID" stands for "Open Researcher and Contributor ID"), and it's generally referred to as simply "ORCID". For example, see ORCID iD (Q51044) and all of its sitelinks. Even if "ID" were to be included, it's silly to lowercase the first letter because it's styled as "ORCiD". Tol (talk | contribs) @ 05:33, 20 January 2022 (UTC)
- Hoi, the label ORCID ID has been used a gazillion number of times. There are many labels that are remarkably similar. You introduce loads of errors with this insistence. Thanks, GerardM (talk) 09:04, 20 January 2022 (UTC)
- What do you mean that it "has been used a gazillion number of times"? I think just "ORCID" is more common. By saying that "there are many labels that are remarkably similar", do you mean aliases on this property? Labels should still be the most common name. How does this introduce errors? Tol (talk | contribs) @ 20:40, 20 January 2022 (UTC)
- See the brand guidelines for the ORCID name. This should be kept as "ORCID iD". Mwtoews (talk) 23:43, 20 January 2022 (UTC)
- @Mwtoews: as far as I know, common names are preferred over a stylised trademark. For example, English Wikipedia (which is often the basis of policies on other wikis) says not to follow trademarks that begin with a lowercase letter, and to use common names. Help:Label also says to use common names. Tol (talk | contribs) @ 00:19, 21 January 2022 (UTC)
investigate ORCID discrepancies
Notified participants of WikiProject Source Metadata
- https://www.wikidata.org/wiki/Wikidata:Database_reports/Constraint_violations/P496#single_best_value lists 1886 violations
- https://www.wikidata.org/wiki/Wikidata:Database_reports/Constraint_violations/P496#%22Unique_value%22_violations lists 196 violations
- https://www.wikidata.org/wiki/Wikidata:Bot_requests#c-Vojt%C4%9Bch_Dost%C3%A1l-20220929161900-Vladimir_Alexiev-20220927170900 checks cases where ORCID is used as a reference on "author" but the author has a different ORCID
- https://www.wikidata.org/wiki/Wikidata:Database_reports/Constraint_violations/P496#%22Format%22_violations lists just 1, but there are more violations in references, e.g. The Heart's Little Brain: Shedding New Light and CLARITY on the "Black Box" (Q111260013) uses a value that violates the ORCID format: "David J Paterson"
These are due to a variety of reasons. Just for the "single value" violations I found the following reasons:
- WD conflation
- ORCID needs merge
- ORCID deprecation (redirect) that is not reflected in WD
Any takers to do some data cleaning? -- Vladimir Alexiev (talk) 06:54, 11 October 2022 (UTC)
Parsing the ORCID dump
For those wishing to parse the ORCID dump, I made an example XSLT that makes it possible to quickly extract the Twitter URLs. XSLT isn't the most loved language, but if you know a bit of XML and XPath, and you don't need a ton of parsing logic, it might be easier to adapt this than to write your own script or parser. It's also very easy to increase the parallelism with xsltproc (if your machine can handle the I/O load). Nemo 21:13, 16 November 2022 (UTC)
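For readers who would rather not use XSLT, the same kind of extraction can be sketched in Python. This is an alternative to the XSLT mentioned above, not a port of it. The sample XML is a simplified, hypothetical stand-in for a record from the dump (real records use namespaced elements), and the sketch sidesteps the schema by scanning the text of every element for Twitter URLs.

```python
import re
import xml.etree.ElementTree as ET

# Matches profile-style Twitter URLs; the exact URL shape is an assumption.
TWITTER_URL = re.compile(r"https?://(?:www\.)?twitter\.com/\w+")


def twitter_urls(xml_text: str) -> list:
    """Collect Twitter URLs found in the text of any element of one record."""
    root = ET.fromstring(xml_text)
    found = []
    for element in root.iter():
        if element.text:
            found.extend(TWITTER_URL.findall(element.text))
    return found


# Hypothetical, simplified record; not the real ORCID schema.
SAMPLE = """<record>
  <researcher-urls>
    <researcher-url><url>https://twitter.com/example_user</url></researcher-url>
    <researcher-url><url>https://example.org/home</url></researcher-url>
  </researcher-urls>
</record>"""

print(twitter_urls(SAMPLE))
```

Scanning all element text is slower than a targeted XPath, but it keeps the sketch independent of element names.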
Regex should be updated
The current validation regex is:
0000-000(1-[5-9]|2-[0-9]|3-[0-4])\d\d\d-\d\d\d[\dX]
But ORCID iDs can now also contain non-zero digits in the first group (e.g. 0008-1234-5678-9012). Sette-quattro (talk) 19:44, 12 April 2023 (UTC)
- The new range is between 0009-0000-0000-0000 and 0009-0010-0000-0000:
- > Fast forward to 2023. The ORCID registry has grown, now with over 16 million records, which means we required a new block of identifiers. This new range reserved for ORCID by ISNI is defined between 0009-0000-0000-0000 and 0009-0010-0000-0000.
- See: https://support.orcid.org/hc/en-us/articles/360006973973-What-is-the-relationship-between-ISNI-and-ORCID 82.135.219.67 15:02, 23 February 2024 (UTC)
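To make the proposal in this thread concrete, here is a hedged Python sketch of a pattern that accepts both the original block and the new 0009 block. It is only a candidate for discussion, not the regex used by the property. The `NEW_RANGE` pattern and the function name `in_known_ranges` are assumptions, and the sketch treats the upper bound 0009-0010-0000-0000 as excluded, which the quoted text does not settle.

```python
import re

# Original block, copied from the regex quoted in this thread.
ORIGINAL_RANGE = r"0000-000(1-[5-9]|2-[0-9]|3-[0-4])\d\d\d-\d\d\d[\dX]"
# Assumed pattern for 0009-0000-... up to (but not including) 0009-0010-...
NEW_RANGE = r"0009-000\d-\d{4}-\d{3}[\dX]"

CANDIDATE = re.compile("(?:" + ORIGINAL_RANGE + ")|(?:" + NEW_RANGE + ")")


def in_known_ranges(value: str) -> bool:
    """Return True if the value has the ORCID shape and falls in either block."""
    return CANDIDATE.fullmatch(value) is not None


print(in_known_ranges("0000-0002-1825-0097"))  # original block
print(in_known_ranges("0009-0001-2345-6789"))  # new block (shape only)
print(in_known_ranges("0008-1234-5678-9012"))  # outside both blocks
```

The sketch checks the shape and block only; it does not verify the final check character.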