How important are game reviews? For many of us, they are vitally important, so important that we won't even seriously consider buying a game until one of our favorite sites has reviewed it and given us some indication as to whether it is worth our hard-earned cash. Others couldn't care less what sort of score a game gets; they are going to buy it regardless. If you are one of the people who ignore review scores, then this is not the blog post for you.
But, for those of us who pay close attention to review scores, you may find this post interesting. Whenever I look through the posts of the many people I track, there is quite often some debate as to whether a game was scored too high, whether it was judged too harshly, or whether a particular site was doing a truly good job at reviewing games. Whether a game's score is too high or too low is a difficult issue to address. Games have such a subjective element to them, and beauty, or beautiful game play, is often in the eye of the beholder. Sure, there are many games out there which, unless you are hopelessly strung out on something, almost everyone is going to say as one, "Now that's a great game!" But, for the most part, games are subject to a wide variety of opinions from those playing them and from those reviewing them.

What I really wanted to do was look at the more popular review sites and try to come up with a way to determine which one is doing the best job at reviewing games fairly and, perhaps more importantly, consistently. For my research, I used Metacritic (the best-known score aggregator out there, considered by many of us the Bible of game scores); Gamespot, IGN, 1UP, and Gamepro (arguably the four most well-known general gaming sites that review games and provide scores); and X-Play (the best-known video game show on TV today). With my panel of contestants in place, I now needed to find some way to compare them to see which one is doing the best job at reviewing games. But how could I do that?
I took a rather scientific approach to this. I recognized that all good experiments have a control, or a constant, that remains fixed while the variables change. What I decided to look for were two games I could compare with each other. I wanted them to be good games, but not great games, so there would be room for error and differing opinions to compare. I wanted the two games to be somewhat similar in game play, and I especially wanted two games that perhaps shared a similar flaw or two. The idea behind this was that if a review site was doing its job properly, then the two games should have the same score, or at least a small difference between the two scores. If the scores were wildly apart from each other, I felt this would indicate that perhaps the review site was not reviewing games as consistently or accurately as it should be. Makes sense, right? At least on paper it sounds good. But what two games would be proper to use for such a comparison?
I thought back to a couple of games I played within the past year or so, and I found them. Both of these games were good games, not great games by any stretch of the imagination. They both had some great moments in them, and both were what I would call action-RPGs. But they both suffered from some serious game play issues, and both had almost exactly the same problems with the camera, a flaw that resulted in some cheap deaths and made each game far less than it could have been. The games I am talking about? Star Wars: The Force Unleashed (TFU) and Too Human.
If you played either of these games for any decent amount of time, I think you will agree with me on this point. Both games have solid stories, with TFU, arguably, having the better story because, well, it's Star Wars and Darth Vader, and an untold story between Revenge of the Sith and A New Hope. Too Human's story was a bit deeper, but it delved into Norse mythology a bit too much for the average gamer. Too Human was a better loot fest than TFU and was certainly more RPG-based, in my opinion. But both games mapped much of what you were capable of, as far as action was concerned, to the right thumb stick, and neither game performed as well as it should have in this regard. Both games also had major issues with the camera. Half the time, you could not tell who you were fighting, where they were, or exactly what you were supposed to be doing. For those who played these games, am I right? I remember thinking, after going through both games, how two games with so much potential could be undermined by the same design flaw. In my opinion, then, anyone reviewing the games should have seen the same problems in both. Whether you were a Star Wars fan or not (and I am a very big fan of the Star Wars Universe), or whether you think Norse gods are the greatest thing ever, both games should have scored pretty darn close to each other. If they did not, then we would have to question what happened in the review process.
As for me, applying how I think Metacritic would translate my review score into a Metascore, I would give Star Wars a 69 (6.9) and Too Human a 67 (6.7). Both games would score in the "yellow" range on Metacritic's color chart, which is rather consistent with what both games actually did score on Metacritic. On my own website (I don't want to give out the address yet, as I am still working on many aspects of the site and, for those who actually track me and pay attention, I would not want anyone to be disappointed), here is how I describe games scoring between a 6.0 and a 6.9:
6 – 6.9: These are games that can really be considered average at best. While these games are all decent titles, and many of them may have more than their fair share of fun, exciting, and enjoyable moments, if a game scores in this range, you can rest assured that your gaming dollar might be better spent elsewhere.
That's a pretty good description of both games, although both are in the higher range of that general range. By contrast, I describe games in the 7.0 – 7.9 range as follows:
7 – 7.9: These are all good, solid titles, but they may not be for everyone. These games may also give you that sort of "been there, done that" feeling. Games scoring in this range certainly get our seal of approval, but there are a few problems with the game that result in it getting dinged down into this category.
So, TFU and Too Human fall somewhere in between the two, and I feel that is a pretty accurate place for both of them. By my own scoring system, morphed into a likely Metascore, the difference between scores is 2. Not bad, and I was rather pleased to see that I am being pretty consistent with my own methods for scoring games. But what of the other sites I mentioned previously? How are they doing? Let's find out. Please note that, for the sake of consistency and because Too Human was a 360 exclusive, I am using the 360 version of The Force Unleashed for this comparison.
First, let's look at Metacritic.
Many people love it; many others hate it. Unlike other sites, Metacritic does not independently review games. It aggregates scores from a wide variety of sources and, using its own formula, comes up with a Metascore that gives gamers an idea as to whether a game is great, good, average, or lousy. While many of us, myself included, love Metacritic, many others say its scoring system is flawed and is not a good representation of a game's true value. Well, using our two games, let's see how close Metacritic comes. Metacritic gave The Force Unleashed a 73 and Too Human a 65, a difference between scores of 8. That's not too bad. Given the wide range of review scores many of us have seen, I am thinking that if the scores of the two games are within 10 points of each other (using Metascores across the board), then the reviewers are doing their job pretty well. So, is Metacritic an accurate gauge of how good a game is? Based on this experiment, I would have to say it does the job pretty well. So well, in fact, that for the remainder of this discussion, I am going to use a game's Metascore as a means of comparing reviews for the other reviewers we are reviewing (yeah, say that 10 times real fast). By doing this, we are adding an additional level of consistency to this experiment. Onward then.
Let's move on to Gamespot, where many of you will be reading this, and the site at which I have been a member longer than any of the other ones.
Gamespot gave TFU a 75, but gave Too Human a 55, a difference in scores of 20. That is a pretty wide discrepancy. To its credit, Gamespot correctly described TFU: when it works, it works well; when it doesn't, it's not much fun. But Gamespot seemed to give the game a much better score than Too Human based on its lineage as a Star Wars game. I can accept that a decent Star Wars game should score better than a brand new franchise. But a 20-point difference is a bit much. Gamespot noted the fun elements of Too Human, but also noted the problem with ranged combat (which is quite present in TFU as well) and seemed to really ding Too Human for the story. That's fine; it deserved to be hit. But TFU should have been hit equally hard for pretty much the same problems. I think the 20-point difference has to be of some concern for a gamer who is looking to Gamespot as a source for a solid review. I really like Gamespot and I have a lot of respect for the reviewers and the staff. But I was a bit surprised to discover this much of a difference.
Let's look now at IGN, another great all around gaming and entertainment site.
IGN gave TFU a 73 and Too Human a 78, a difference in score of only 5. Now that's more like it. You have to give the guys at IGN some real credit, because they called TFU for what it was: a good action game that allowed a gamer to control The Force like no previous Star Wars game and wrapped it all around a great story. But they also recognized all too well some real level design flaws and the repetitive nature of many of the battles. Story was not enough to save TFU in IGN's eyes. As for Too Human, IGN recognized the aspect of the game that Too Human handles quite a bit better than TFU: the RPG elements. The skill tree makes a big difference in Too Human, perhaps too much so, as your character tends to become too powerful, and this results in the systematic and somewhat unchallenging beat-down of anything that dares oppose you. While I disagree with IGN that Too Human was the better game, the small difference in scores tells me that gamers can be rather confident that the IGN reviewers are getting it right.
Let's now move to 1UP, another excellent all around review site.
1UP did not like either of these games, as it gave TFU a very low score of 50 and Too Human an even lower score of 42, a difference in score of 8. In my mind, that is pretty consistent. Remember, we are not looking at whether someone scored a game too high or too low; we are looking for consistency in reviewing two games that are very similar in their strengths and weaknesses. 1UP was clearly disappointed with how the use of The Force was actually implemented, and equally disappointed that TFU did not, in any way, live up to its potential. I think that may be a bit harsh, but 1UP was certainly not the only outlet that felt that way. As for Too Human, 1UP was even harsher. But did they ever hit the nail on the head with their criticism that the game has problems with the right analog stick and relies on it far too much. While arguably scoring both games a bit too low, you have to look at 1UP and acknowledge that they were very consistent in finding both games to be rather bad. As such, gamers should feel relatively confident in the accuracy of 1UP's reviewers.
The last primarily web-based reviewer I am going to look at is Gamepro, yet another excellent gaming site (and magazine).
Gamepro gave TFU a 70, while it gave Too Human a somewhat surprising 80, a difference in scores of 10. In my mind, that puts Gamepro somewhat on the outside looking in as far as reliability and consistency are concerned. It is puzzling that TFU got only a 70, even though the reviewer at Gamepro noted it was a fun game despite its flaws, while Too Human got an 80, primarily due to the co-op aspect of the game. I agree with Gamepro that co-op is something TFU does not have, and Too Human deserves some bonus points for that. But, with all the other problems that Too Human has, wouldn't a score of 75 have made more sense? Again, we go back to the fact that you can't note the fundamental game play flaws inherent in both games and still find one game to be 10 points better than the other without those of us in the know raising an eyebrow and going, "Hmmmm…" The 10-point difference is not offensive, but you have to look at it and put Gamepro a notch below IGN and 1UP, though a notch above Gamespot.
The final reviewer I am going to review is G4's X-Play.
Now, look, I really enjoy X-Play. I think it is one of the more informative gaming shows out there, and some of the hands-on demos they get a hold of really whet my appetite for certain games. I have a lot of respect for Adam Sessler, and I think he is very knowledgeable. Morgan Webb is too MMO-grounded, and I think this taints a lot of her reviews. With that caveat being said, however, X-Play gave TFU a 40 (2 stars out of 5), but it gave Too Human an 80 (4 stars out of 5). For those keeping score, that's a difference of a whopping 40 points. That's simply wrong. X-Play demoed both games leading up to their release, but they were clearly more smitten with Too Human. X-Play's score for TFU is the lowest recorded score on Metacritic, and it makes absolutely no sense. They note the great story. They note the very good character development and the very good graphics. Then they rip the game apart based on game play and bugs. OK. Fair enough. But, in scoring Too Human as high as they did, they ignore the exact same problems in game play for which they eviscerated TFU. If they had given TFU 3 out of 5, which likely would have translated into a 60, I could live with that. But a 40-point difference is just not acceptable, and gamers should probably be careful if they are relying solely on X-Play to tell them whether or not a game is good. X-Play claims they give brutally honest reviews and, more often than not, I agree with them. Here, however, they are just brutally incorrect.
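The final ranking follows mechanically from the absolute difference between each reviewer's two scores. As a quick sanity check on my tally, here is a short Python sketch using the Metascore-equivalent numbers quoted above (the site names and scores come straight from this post; the code itself is just illustrative bookkeeping):

```python
# Scores as reported above, as (The Force Unleashed, Too Human) pairs.
scores = {
    "Metacritic": (73, 65),
    "Gamespot": (75, 55),
    "IGN": (73, 78),
    "1UP": (50, 42),
    "Gamepro": (70, 80),
    "X-Play": (40, 80),
}

# Consistency metric: absolute difference between the two scores.
# Smaller means the reviewer treated two very similar games more alike.
diffs = {site: abs(tfu - th) for site, (tfu, th) in scores.items()}

# Rank from most to least consistent.
for site, diff in sorted(diffs.items(), key=lambda kv: kv[1]):
    print(f"{site}: {diff}")
```

Running this puts IGN first (a difference of 5), Metacritic and 1UP tied at 8, Gamepro at 10, Gamespot at 20, and X-Play last at 40, which is exactly the ordering in the conclusion below.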
So, there you have it. Excluding my own review (which I would not blame any of you for discounting, saying that I was just trying to show how smart I was by coming up with my own scores with a difference of only 2), we have IGN as the most accurate reviewer, followed by a tie between Metacritic and 1UP. Below them is Gamepro, followed by Gamespot, with X-Play bringing up the rear. I am sure this experiment will be ripped to shreds by many people (I brace for the comments from any of the staff here at GS whom I may have offended... just making an honest observation, guys and gals), and I am fine with that. It was never an exact science, which is why it's called opinion and why I have limited the post to the "opinion" category. I do hope, however, that those of you who take reviews seriously have found this entertaining, informative and, perhaps, a bit enlightening.