Inter-Rater Reliability Of Third-Party Coin Grading Services

Author	Replies: 8 / Views: 2,413
dd27 Pillar of the Community United States 666 Posts	Posted 07/20/2016 1:20 pm An important topic in professional psychology (my day job ;-) is inter-rater reliability. If different experts provide very similar ratings, then the rating process, or method, has high inter-rater reliability. Consequently, the ratings are dependable: you can count on them as not being so subjective as to become meaningless. So that I don't go too far afield, see this site for more info on inter-rater reliability: http://www.socialresearchmethods.ne...reltypes.php

So, I wondered if inter-rater reliability studies have been conducted regarding coin grading in general, and among the TPG services specifically. My search of this forum and some Google searches found two studies:

1) A study conducted by a coin collector, StuJoe, who might have been a member of this forum and who at one time had a couple of websites, e.g., TheStuJoeCollection.com, which are no longer functional. The results of his study are posted at: http://www.rickbassett.com/pace/dis...0results.htm His results show significant variation among the grades assigned by his participants, i.e., low inter-rater reliability.

2) A computer science PhD student (now a professor), Rick Bassett, conducted research for a dissertation, "Machine Assisted Grading of Rare Collectibles through the COINS framework", which I found quite interesting, since he approached the topic in a meticulous, scientific manner, and because he constructed a fairly accurate machine-grading system (optics, software, etc.). His dissertation is available online: http://www.richardbassett.com/pace/...on%204.0.pdf Detailed info about his research process is available too: http://www.rickbassett.com/pace/dissertation/

In terms of the inter-rater reliability of experienced ("expert") coin graders, the results revealed a very wide range of grades assigned to digital photographs of various Lincoln cents. In statistical terms, there was a high standard deviation, which indicates that the inter-rater reliability was low. The results are on pages 90-91 of his dissertation.

These results support what many of you have said on this forum:

* Do not become over-focused on TPG numbers to the exclusion of your own assessment of a coin's grade, appearance, and desirability.
* Understand that different 'experts' will assign different grades.
* Judgments regarding the 'best' TPG often involve factors other than actual accuracy, such as how the TPG markets itself and social conformity [ https://en.wikipedia.org/wiki/Conformity ].
* 'Machine-grading', using a combination of computer analysis and human judgment, is probably the wave of the future, and will no doubt be marketed as "even better", for which TPGs will most likely charge a premium.

QUESTIONS

a) Do you know of any other research studies, particularly comparing the TPGs?
b) What other conclusions would you draw from the available research?
c) To what extent do you believe the TPGs would welcome a rigorous, unbiased, independent, audited study of the inter-rater reliability of their grading compared to themselves and other TPGs? (My guess is that it's the last thing they would want to see as it might put them behind the eight ball.)

Thanks! Mark
Edited by dd27 07/20/2016 6:53 pm
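As a rough illustration of the standard-deviation point in the opening post, here is a minimal Python sketch. The grades are invented for the example (they are not taken from the dissertation or from any TPG), and the coin names and the `grade_spread` helper are hypothetical. A real reliability study would more likely report a formal coefficient such as an intraclass correlation; the per-coin standard deviation is just the simplest way to see the idea.

```python
from statistics import mean, stdev

# Hypothetical grades on the 1-70 scale from five graders per coin.
# These numbers are made up for illustration only.
grades = {
    "coin_A": [63, 64, 63, 65, 62],  # graders mostly agree
    "coin_B": [58, 64, 55, 66, 61],  # graders disagree widely
    "coin_C": [65, 65, 65, 65, 65],  # perfect agreement
}


def grade_spread(ratings):
    """Return (mean grade, sample standard deviation) for one coin."""
    return mean(ratings), stdev(ratings)


for coin, ratings in grades.items():
    avg, sd = grade_spread(ratings)
    # A larger standard deviation means the graders agree less,
    # i.e. lower inter-rater reliability for that coin.
    print(f"{coin}: mean grade {avg:.1f}, std dev {sd:.2f}")
```

The wider the spread for a coin, the less any single assigned grade tells you about what another grader would have said.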

BStrauss3 Pillar of the Community United States 4610 Posts	Posted 07/20/2016 6:02 pm Question c: none, zero, nada, no chance, as it would challenge their very business model.
-----Burton 50+ year / Life / Emeritus ANA member (joined 12/1/1973) Life member: Numismatics International, CONECA Member: TNA, FtWCC, NETCC, EveryCountry (online) coin club Owned by three cats and a wife of 40+ years (joined 1983) Author: 3rd Edition of the Sample Slabs book, https://www.sampleslabs.info/
dd27 Pillar of the Community United States 666 Posts	Posted 07/20/2016 6:48 pm "none, zero, nada, no chance" - perfect. Edited by dd27 07/20/2016 6:48 pm
Coinfrog Bedrock of the Community United States 94367 Posts	Posted 07/20/2016 7:35 pm Yeah, baby. Next question, please. Edited by Coinfrog 07/20/2016 7:36 pm
Conder101 Bedrock of the Community United States 17884 Posts	Posted 07/21/2016 11:35 am Question A: only one, conducted years ago by Coin World, when they sent the same group of coins to each of the TPGs for a comparison of grades. It had some interesting results, but the sample size was too small and it should have been repeated multiple times. A single pass for a given coin through each service only gives you a single data point and does not detect problems with consistency or changing standards over time. (By the time a coin gets through all the top five companies, six at the time, a year or more may have passed. Would the same coin get the same grades on a second run-through? What about a third, or fourth?) A proper study could be set up and done, but it would be expensive and time-consuming. I once figured it could be done for somewhere between $100K and $200K (not including the cost of the coins). It would probably be higher today.
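The single-data-point problem described above can be sketched in a few lines of Python. Everything here is hypothetical: "Service X" and "Service Y" are made-up names, the grades are invented, and `consistency` is an illustrative helper, not a method used in the Coin World study.

```python
# Hypothetical grades for ONE coin, cracked out and resubmitted several
# times to two made-up services. Invented numbers, for illustration only.
submissions = {
    "Service X": [64, 64, 63, 64],
    "Service Y": [65, 62, 64, 66],
}


def consistency(passes):
    """Return the grade range (max - min) across repeat submissions.

    Returns None when there are fewer than two passes, because a single
    grade says nothing about how consistent the service is.
    """
    if len(passes) < 2:
        return None
    return max(passes) - min(passes)


for service, passes in submissions.items():
    # Using only the first pass mimics a one-shot comparison.
    print(service, "single pass:", consistency(passes[:1]),
          "| repeat passes:", consistency(passes))
```

With only the first pass, neither service yields any consistency figure at all, which is why a one-shot comparison cannot separate a consistent service from an inconsistent one.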
Jaobler Pillar of the Community United States 6406 Posts	Posted 07/21/2016 3:40 pm I remember that Coin World study, in which the same group of coins was sent to a TPG service, then cracked out of the slabs and sent to the next TPG on the list. They covered all the majors at the time: PCGS, NGC, ANACS, ICG, PCI, SEGS, Accugrade (I think), and maybe one other. The coins were all US coins (both circulated and unc), covering many denominations and design types. Most were low-value pieces. The data seemed to rank PCGS and SEGS as the two services that were the most conservative and the most consistent. That result influenced me to look more kindly on SEGS coins when making buying decisions. Later attempts to cross SEGS coins to PCGS were not very successful as PCGS seemed to grade the ex-SEGS coins more harshly. I'd enjoy seeing a similar study performed, as long as someone else pays for it!
Tryna Pillar of the Community United States 937 Posts	Posted 07/26/2016 12:12 pm Computer grading was tried before. At the time the computers were not up to it and the public was not ready for grades like MS 63.3141. Now the computers could most likely do it easily, and with the acceptance of stars and pluses and stickers I think the collecting public would eat it up. I do not look for PCGS or NGC to start it out; however, a new company could start it. If they could pull enough business away from the big guys, the big guys would have to follow. When this happens the new lowball will be P 0.001, and there will be registry sets of AG 3.14 and G 6.66. Start saving your money, boys; a whole new numbers game will ensue!
Conder101 Bedrock of the Community United States 17884 Posts	Posted 07/26/2016 1:09 pm I think the computers were up to it back in 1991, and they were not using decimal point grading (even though the computers probably did internally), just the whole number grades. At least PCGS wasn't. Compugrade did use one decimal point. I think the only way a new company could get away with it would be if the grades came back at least with the same whole number grades that PCGS or NGC assigned. The problem would be that your "objectively" grading computer would be trying to consistently hit the subjective grades assigned by the big two. You can program your computer as precisely as you want and even get it to give the same grades consistently as on a sample set from the big two, but as soon as it is given a coin from one of the big TPGs where the grader just liked it and so gave it an upgrade, your program gives it a lower grade and is therefore "wrong". That was probably one of the reasons PCGS gave up the computer grading. It always gave the precise answer its programming called for, but the human graders didn't. So sometimes they would agree with the computer and sometimes they wouldn't.
Tryna Pillar of the Community United States 937 Posts	Posted 07/26/2016 2:10 pm I think computer grading is a sure bet.
In 1960 a computer was a guy doing long division with pencil and paper.
In 1970 a computer was something that guided the man-on-the-moon rockets.
In 1980 a computer was a Commodore 64 or an IBM 8088.
In 1990 a computer was a 286.
Now everything from A to Z runs on a computer, and 4-year-old kids are using toys and games with more computing power than the entire Apollo mission control.
(Yes, I know there were computers long before 1960. My wife worked with John Kemeny, for crying out loud.)
The point is, two guys sitting in a dim room with a light on their left grading coins will soon be a thing of the past. Almost everyone whines about 'gradeflation' and "How did this get in a 66 holder?" and "Why do they overgrade toned coins?" Collectors of the near future will demand more accurate computer grades for their coins.