Help:Wikipedia: The Missing Manual/Building a stronger encyclopedia/Categorizing articles

From Infogalactic: the planetary knowledge core

When you look at the bottom of a Wikipedia article, you see category links. For example, the article Coat of arms of Copenhagen has the category links "1661 establishments", "Culture in Copenhagen", "History of Copenhagen" and "Danish coats of arms". Category links are a big help for readers looking for articles related to a topic. Those links are there because editors like you added them. Wikipedia's software doesn't do automatic categorization, and Wikipedia employs no professional categorizers.

Adding categories to articles is easy: Just type a few words, add brackets, and save. The trick is figuring out what category links would provide maximum usefulness to readers, and that's what this chapter shows you. It also explains the other half of the categorization picture—the category pages where category links are listed. You can create and improve upon those pages, too.

Fundamentals of categorization

When you click one of the category links at the bottom of articles and other Wikipedia pages, you go to a category page. For example, in the article Zoltan Acs, if you click Category:American economists, you go to the page shown in Figure 17-1.

Figure 17-1. When you go to the page Category:American economists, you see links to 722 articles categorized as being about American economists, plus three subcategories. Every category page looks like this one, with four parts: some introductory text, possibly with a link or two; a list of subcategories; a list of any pages that belong directly to the category rather than one of its subcategories; and finally the categories to which the category page belongs.

Categories aren't limited to articles. Portal pages, Wikipedia instructional pages, and some user pages, for example, also have categories. Even category pages themselves have categories; in fact, that's a critical part of how the category system works at Wikipedia. (More on that a bit later.)

Category pages are the payoff for category links at the bottom of articles—they help readers find other articles about the same topic as the one they were reading. Without category pages, there'd be little point in adding category links to articles. Without category links in articles, category pages wouldn't have any content other than some introductory text.

Category links in articles

Wikipedia has plenty of category pages, like the one shown in Figure 17-1. These pages become more useful when editors keep adding relevant articles to them. You do that not by editing a category page, but by editing the articles themselves.

Adding categories to articles

You can add a category link to the bottom of an article in two ways: by typing the category link into the article's wikitext (in the edit box), or by adding a template to the wikitext. You do the first for the topical categories that interest readers, such as Category:Australian astronomers. The second way (with a template) is usually for marking an article as needing cleanup (improvement) work, for example, Category:Articles lacking sources from September 2007.

The basic category link

Adding a topical category to an article is easy. To add the category Toddler sports to an article, for example, open that article for editing, and type, near the bottom (more on that in a moment), the text [[Category:Toddler sports]]. Then do the standard stuff: Add an edit summary, preview your edit, and save the change. The text you typed is exactly how the link appears at the bottom of the article.

There's one twist, however: When you preview, you won't see the new category above the edit toolbar, where the rest of the article is displayed. Nor will you see any existing categories for the article above the toolbar. Rather, you have to scroll down, past the edit box, past the insertable wikicode, and past the template list, to the very bottom of the window—that's where you'll see the categories displayed.

Note:
Where categories show in preview mode is just an oddity of Wikipedia. Bug 10244, which reported this issue in June 2007, doesn't seem to be a priority for the developers, so don't hold your breath that it'll change.

New category links always should be added near the bottom of an article's wikicode. You normally edit (add, delete, change) category links by going to the last section of an article and clicking the "edit" link. If you find any category links anywhere other than in the last section of an article, move them.

Note:
The only wikitext below the links for categories should be stub templates (used for short articles). Stub templates look like this: {{text including "stub"}}.
Note:
Interlanguage links also used to be located after category links. Interlanguage links looked like this: [[fr:Parapsychologie]]. That example is a link to the article in the French Wikipedia on the topic of parapsychology, and it used to be near the bottom of the wikitext of the English Wikipedia article Parapsychology. But as of summer 2014, these links are maintained in Wikidata and don't appear in the text of the articles.
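
As a rough sketch of the placement rules above, the end of an article's wikitext might look like the following. The article text, the references section, and the stub template name (Example-stub) are placeholders invented for illustration; only the category link comes from the example in this section.

```wikitext
...last paragraph of the article text...

== References ==
<!-- references go here -->

<!-- Category links sit near the very bottom of the wikitext -->
[[Category:Toddler sports]]

<!-- Only a stub template, if the article has one, goes below the categories.
     "Example-stub" is a hypothetical name, not a real template. -->
{{Example-stub}}
```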

Category links from templates, for maintenance and stubbing

You usually create cleanup or maintenance category links with a template. For example, adding the template {{unreferenced|date=January 2008}} to an article creates an article message box wherever in the article you placed the template, plus it adds the category Articles lacking sources from January 2008 to the bottom of the article. To see the many such templates that you can add to articles, go to Category:Wikipedia maintenance templates.
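
To make this concrete, here is a minimal sketch, assuming the maintenance template is {{unreferenced}} (a template that, on the English Wikipedia, fills the dated "Articles lacking sources" categories). The surrounding article text is a placeholder.

```wikitext
<!-- Placed at the top of the article in this sketch; the message box
     appears wherever the template is placed -->
{{unreferenced|date=January 2008}}

'''Example topic''' is a placeholder opening sentence for this sketch.

<!-- No [[Category:Articles lacking sources from January 2008]] link is typed
     anywhere in the wikitext; the template supplies that category. -->
```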

Stub templates—templates added to very short articles—aren't maintenance templates per se, but they're similar. They similarly change the contents of an article as well as adding one or more categories at the bottom of the article. Consider, for example, the article Hulihee Palace, which includes the template {{Hawaii-struct-stub}} (see Figure 17-2).

Figure 17-2. These templates are in the article Hulihee Palace. They're visible only when you click "edit this page," and then scroll to the bottom of the page.

The template {{Hawaii-struct-stub}} generates two categories: Western United States building and structure stubs, and Hawaii stubs. Those two categories may be of some use to readers, but their main purpose is to signal to editors that this article needs to be expanded.

Fixing category links that come from templates

Occasionally, when you want to change a category shown at the bottom of an article, you can't find that category link in the article's wikitext, even if it's a topical category that normally wouldn't come from a template. In that case, the category must be coming from a template. Finding that template, and perhaps changing it (for example, to delete the category from the article), is a five-step process:

1. Click the "edit this page" tab.

Don't click a section edit link; click the tab at the top of the article.

2. Scroll down to the bottom of the page, where you'll see the lists of templates used for the page.

Figure 17-3 is an example.

3. Check a template (open it in a separate window or tab), starting either at the top or with the most likely suspect.

You'll arrive at a page with the prefix Template:, like Template:Registered Historic Places.

4. Click "edit this page" to see the wikitext of the template; see if the category in question is there. If it isn't, try the next template.

When you do find the category, it will look just like an article wikilink: [[Category:Whatever]].
Note:
If the template is protected from editing, you can still see the wikicode by clicking the "view source" tab. But that's generally not worth doing; protected templates are almost always those used in a large number of articles, and those kinds of templates don't include article topic categories.

5. Usually, the problem is that the category is relevant to the template, but not to articles that the template is in. In that case, put <noinclude> in front of the category link and </noinclude> just after it.

This code tells the software that when the template is put into an article page, the category link should not be carried along with it.
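
The change described in step 5 might look like the following sketch. The template content is a placeholder, and Category:Whatever is the same stand-in name used above; the point is only where the noinclude tags go.

```wikitext
<!-- Before: every article that uses the template inherits the category -->
...template content...
[[Category:Whatever]]

<!-- After: the category applies to the template page only -->
...template content...
<noinclude>[[Category:Whatever]]</noinclude>
```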

When you change a category as described in these steps, the change can take days to show up in the article itself. The reason for the delay is that a given template can be located in thousands of articles. Trying to update all categories immediately could severely affect Wikipedia's servers. So the changes go into a job queue, and the software handles them little by little. Be patient.

Effective categorization

Every article should have a category, even if it's the Uncategorized pages category. That's one reason to add a stub template to an article—it puts it into one or more categories. In fact, articles usually have multiple categories, at least one of which should be the relevant category subject. If you see an article with only one category, you usually can improve things by adding more categories.

A category's relevance should be immediately apparent. For example, if an article has the category Russian architects, but there's no indication that the subject of the article is either Russian or an architect, then either the category is wrong or the article needs to be expanded.

Note:
If you find a category that isn't supported by text in the article, check the history tab for vandalism. If you don't see any, then you can either remove the category, or add a template to point out the problem. If you remove the category, make sure your edit summary clearly says what category you're removing, and why. If you want to mark it as a problem, add the {{Category unsourced}} template immediately after the category link in the wikitext, and make sure your edit summary indicates why you're adding the template.
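
A sketch of the second option, using the Russian architects example from above. The template goes directly after the category link, with nothing in between.

```wikitext
<!-- The category stays in place; the template flags it for other editors -->
[[Category:Russian architects]]{{Category unsourced}}
```

An edit summary along the lines of "article text does not support this category" would accompany the change; that wording is only a suggestion.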

Don't add both a category and its subcategory. For example, Golden Gate Bridge is in Category:Suspension bridges, so it should not also be in Category:Bridges. (If you think you've got a valid exception, check at the guideline Wikipedia:Categorization and subcategories—shortcut: WP:SUBCAT—to see if you're right.)
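
In wikitext terms, the Golden Gate Bridge example comes down to the following sketch. The two categories are shown in isolation; a real article's category list would be longer.

```wikitext
<!-- Redundant: Suspension bridges already falls under Bridges -->
[[Category:Bridges]]
[[Category:Suspension bridges]]

<!-- Preferred: keep only the more specific category -->
[[Category:Suspension bridges]]
```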

Finding the right category

The real challenge is to find the right categories to add to an article. Some suggestions:

  • Go as low as possible in the hierarchy of categories. The more specific, the better. To this end, putting an article into a huge category—like People—is better than nothing, but not by very much, as Figure 17-3 shows. It's unlikely that a reader is going to say "Ah, this article is about a person. I'm really interested in reading about people—let's see what other articles Wikipedia has that are also about people."
Figure 17-3. The category People includes a number of subcategories (not shown) plus six individual articles. Two of these are not like the other four—Altun Bishik and Kailash Singh Parihar. Both should be in a subcategory.
  • If you can think of a similar article (for example, for an article about a ballet star, who else is or was a similar star?), check that article to see what category links appear there.
  • If you can't think of a similar article, use a wikilink in the lead paragraph to get you to one. For example, if the first sentence of an article begins "Penalty methods are a certain class of algorithms to solve constraint optimization problems," you could click the link to the article Algorithm or to the article Optimization (mathematics). Once there, look at the links in the article, links to the article (at the far left, click "What links here"), and categories of the article. All of these could lead you to similar articles; the category links might even be useful as is.
  • Be thorough. Think of all the different things a topic may be associated with—geographic area, a historical period, an academic subfield, a certain type of thing (like a food or an ornament), or a special interest topic. The more, the better. Just remember there needs to be some supporting text in the article. For example, the article Terracotta has links to the following category pages: Shades of red, Ceramic materials, Sculpture materials, Pottery, and Terracotta. That last category covers the whole topic of Terracotta—as the most significant article on the topic, the Terracotta article is at the top of the category, but there are many other pages in the category as well.
The CategoryTree special page

One of the best ways to find a good category for an article is to start at a higher level category and see what subcategories are available. The easiest way to do that is with the CategoryTree Special page (Figure 17-4). This special page is a window into Wikipedia's category system, starting wherever you want to.

To get to this page, look in the "toolbox" box on the left side of the screen. Click the "Special pages" link, and click CategoryTree. Then enter a category (for example, Australia), and click "Show Tree." You can either pick a subcategory that's visible, or click a "[+]" indicator to see subcategories under any particular line.

Figure 17-4. The great thing about the CategoryTree Special page is that you don't have to open a bunch of category pages to find good subcategories. You can simply click one here, find (or not find) something you want to add to an article, and then try another category, drilling down as needed—all in one place.
  • If you're not sure about a category, either be bold and add it, or use a higher level category that you're sure about. In either case, add the template {{checkcategory}} to the article. The template puts the article into Category:Better category needed so that other editors can see a review is needed.
  • If you want to add a category to an article, and the category doesn't exist yet, don't just add the category link (which will show up as a red link). Instead, go through the process of creating (or at least considering the creation of) a new category, as described in the section about creating a category, one step of which is to search further.
A link to a category page that doesn't exist is subject to being summarily deleted, though it may be a while before someone notices. A link to a new category page that has exactly one article is likely to lead to the category page being deleted quickly, and then the link to the now non-existent category page being deleted from the article as well. Don't create an impromptu, temporary, spur-of-the-moment, best-guess-and-I'll-fix-it-later category; you'll just be wasting your time and the time of other editors.
  • If you don't know what category to add to an article, don't worry about it. Instead of guessing at a category, use the {{uncategorized}} template to bring the article to the attention of others. Editors who love to categorize articles will find a category for it.
Add the date parameter to the {{uncategorized}} template—for example, {{Uncategorized|date=January 2008}}. (Be careful of the parameter spelling; parameters are never capitalized. If you type Date instead of date, the software will ignore the parameter altogether.) If you don't add this parameter, a bot will do so, adding yet one more edit to Wikipedia and the history of the article. That could obscure other edits, such as vandalism, so don't skip the parameter.
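
A minimal sketch of the dated template in use. Placing it at the bottom of the wikitext, where category links would otherwise go, is an assumption made for this illustration; the article text is a placeholder.

```wikitext
...article text...

<!-- Lowercase "date"; per the text above, a capitalized "Date" is ignored -->
{{Uncategorized|date=January 2008}}
```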

Getting articles into the right place on a category page

Suppose you've added a good category to an article page, and you then look at the category page to make sure the new category is there. You find it, but in the wrong place: The article is about Jane Doe, and it's in the category page's "J" section, not the "D" section where it belongs. The article is in the wrong place (as far as you're concerned) because the Wikipedia software doesn't understand anything about names of people.

You can fix this problem in one of two ways:

  • You can specify, in a category link, where the title of an article is to be listed within a category page. You do this in a way that looks very much like a piped link: [[Category:Female crime victims | Doe, Jane]]. (The spaces around the "|" symbol are optional.)
  • You can add something that specifies the sorting information ("sort order") for all the categories on the page, rather than doing that link by link. In this example, you just place the following directly above the listed categories in the wikitext: {{DEFAULTSORT:Doe, Jane}}.
Note:
The word "DEFAULTSORT" is called a magic word, because it does unusual things. One way that magic words can be identified is that they use all capital letters. (For the full list, see the page Help:Magic words—shortcut, H:MW.) And technically, though it's surrounded by double curly brackets, it's not a template.

Neither of these methods of changing where an article appears on a category page changes the name of the article. The article's name will still be Jane Doe, but it will now be listed in the "D" section, not the "J" section, of a category page. In the best of worlds, all the editors who add a category to an article will add the same sort order (in this case, lastname, then firstname). If they don't, you should feel free to fix the (hopefully few) category links that don't.
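
Side by side, the two methods look like this in the Jane Doe article's wikitext. The second category name is hypothetical, added only to show that DEFAULTSORT covers every category link on the page.

```wikitext
<!-- Option 1: a sort key on one category link -->
[[Category:Female crime victims|Doe, Jane]]

<!-- Option 2: one default sort key for every category on the page,
     placed directly above the category links -->
{{DEFAULTSORT:Doe, Jane}}
[[Category:Female crime victims]]
[[Category:Example second category]]  <!-- hypothetical category name -->
```

With option 2, a new category added later picks up the same sort order without any extra typing.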

Category pages

As you work on adding good categories to articles, you'll encounter category pages like the one in Figure 17-1, and as a reader, you'll find them useful for getting to articles you're interested in. But there's more to category pages. They're created and managed by editors, and, like all other pages at Wikipedia, there are times when you, as an editor, can edit them to improve them. This section explains what you need to know.

Hierarchy: The categorizing of category pages

Every category page should have at least one parent—a higher-level category. (The exception, of course, is the category page at the very top of the hierarchy.) Or, to put it differently, every category page but the very highest (shown in Figure 17-5) should be within a subcategory of at least one higher-level category.

Figure 17-5. The highest category in Wikipedia—the only category that doesn't belong to a higher category—is Category:Contents. It has thirteen subcategories.

Figure 17-6 shows a category page that itself is assigned five categories.

Figure 17-6. The category page Algerian media has five parent categories—one is a cleanup category; the other four are higher level topical categories. Put differently, Algerian media is a subcategory of five categories, four of them topical and one a cleanup category.

Changing the categories assigned to a category page

Did you spot the error with the five parent categories in Figure 17-6? The problem is that Algerian culture, if you check, also has the parent category Algeria. A category page shouldn't have two parents where one parent (Algerian culture, in this case) is itself a subcategory of the other parent (Algeria, in this case). One parent category (Algerian culture) is enough, since someone looking at the higher level category (Algeria) can always find the page by just drilling down.

Fixing this categorization error is simple: Open the page Category:Algerian media for editing (Figure 17-7), find the line with [[Category:Algeria|Media]], and delete that. Then, following the standard procedure, add an edit summary, do a preview (categories show up at the very bottom of the page), and save the change.
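
The edit amounts to the following sketch. Only the two category lines named in the text are shown; the page's other parent categories and its interlanguage links are left out because their exact wikitext isn't given here.

```wikitext
<!-- Before -->
[[Category:Algeria|Media]]
[[Category:Algerian culture]]

<!-- After: the link to the higher-level parent is gone -->
[[Category:Algerian culture]]
```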

Figure 17-7. On the category page Algerian media, the line with [[Category:Algeria|Media]] needs to be deleted, since [[Category:Algerian culture]] already leads to that. While you're looking at the wikitext, note that three of the four categories listed have a sort order, which affects where this category page is displayed on the higher level category page (under "A" or "M"). Finally, notice the two interlanguage links—German and French—which create links in the left margin to similar category pages at the German and French versions of Wikipedia.
Note:
You can find category pages without parents (category pages not themselves in a subcategory) by looking at the page Wikipedia:Database reports/Uncategorized categories. This page is a report that's not constantly updated, so if you're not looking at a fresh report, you'll probably find that a lot of the listed pages have either been fixed or proposed for deletion. If you find categorization interesting, you may want to help out here: Every category page but one (Category:Contents) should have a parent category.

Renaming, merging, or deleting a category page

Changing a category assigned to an article or a category assigned to a category page is easy—just a quick edit. By contrast, if you want to rename a category page, you need to go through a longer process; you can't just click the "move" tab, because there is no move tab.

Renaming, merging, and deletion of pages in the Category namespace is discussed at Wikipedia:Categories for discussion (shortcut: WP:CFD). See Figure 17-8.

Figure 17-8. The page Wikipedia:Categories for discussion is for discussion of renaming, merging, or deleting of all types of categories except for two, discussed elsewhere: user categories (as in Category:Wikipedians who dislike excessive categorization) and stubs (categories for very short articles). There's a separate page for discussion of user categories probably because they can be particularly controversial, or trivial. The separate page for discussion of stub categories is because this is a very specialized area.

If you think this type of action is needed, follow the instructions on the WP:CFD page. Note that there are sections for non-controversial actions: "speedy renaming" (after a wait of 48 hours) and "speedy deletion" (for example, because a category is what Wikipedia calls "patent nonsense," defined as something "unsalvageably incoherent").

Creating a new category

If the category doesn't exist, you can create one. Whether you should create a new category, however, is another matter. While Wikipedia clearly still needs a lot more articles, it's not clear that it needs a lot more categories for those articles. So, here are some questions to consider before you create a new page:

  • Will the new category have more than a few pages in it? The more pages you can find that fit the category, the more likely the category will survive. A category with just one article belonging to it is likely to have a short life.
  • Is the category being added to pages that already have adequate categories assigned to them? As the guideline Wikipedia:Overcategorization (shortcut: WP:OC) says, "not every verifiable fact (or the intersection of two or more such facts) in an article requires an associated category. For lengthy articles, this could potentially result in hundreds of categories, most of which aren't particularly relevant." (The essay "Do not write articles using categories"—shortcut WP:DNWAUC—is also informative on this matter.)
  • Is the category name neutral and factual? A category like Shiftless no-good politicians who should be recalled from office is hopelessly non-neutral (see WP:NPOV), not to mention unverifiable (see WP:V).
  • If a category seems obvious, did you thoroughly look for it under a different name? Particularly when you're adding a category to a new or stubby article, don't just assume that if Category:People from New Zealand doesn't exist, and if it fits the article, you should create it. (See the section about finding categories for more detail on finding good categories for articles.)

If you decide to create a new category, you do that by creating a category page, exactly as you would any other page. Just type the name (for example, Category:Some new article category that Wikipedia needs) into the search box, and click Go. When the software tells you no such page exists, click the "Create the page" link, type some introductory information about the category, and save the page.
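
As a rough sketch, the wikitext of a brand-new category page could be as short as the following. The introductory sentence and the parent category name are hypothetical; the parent link is included because, as discussed earlier in this chapter, every category page should belong to at least one higher-level category.

```wikitext
<!-- Wikitext of a hypothetical new category page -->
This category contains articles about example topics.  <!-- introductory text -->

<!-- At least one parent category, so the new page is not an orphan -->
[[Category:Example parent category]]
```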

Note:
Alternatively, you can first add a not-yet-created category to an article, which creates a red link. When you click the red link, you'll see the "Create this page" option. Click it, and then follow the rest of the procedure in the prior paragraph.

There's one clear exception to the general rule that you should always be hesitant to create a new category: when a lot of articles (say, over a hundred) are in a non-cleanup category that has no subcategories, or are in a category when no applicable subcategories for them exist. In such a case, creating additional subcategories, and moving articles out of the category and into a subcategory by editing article pages, is a good thing to do.

Building out categories

Suppose you've found a good category for an article you're working on, and when you get to the category page, you're surprised that there aren't many more articles listed there. Consider your surprise an opportunity to improve Wikipedia. You can take up the challenge to make the category page a much better place for readers to go.

WikiProject members do this type of work all the time. They look for articles encompassed by their project, and then add WikiProject templates to article talk pages and categories to articles. You don't have to be a member of a WikiProject, however—just someone who realizes how useful categories are to both readers and editors.

There are significant advantages to working from a category page outward, looking for articles to add. Use existing articles to get clues to similar topics, both by reading the article text and following internal and external links in the articles. If you're fairly knowledgeable on the topic of a category, so much the better; you've probably got a lot of good ideas about articles to look for.

Discussing categories

Figure 17-9. Want to mention a category within a comment you're making on a discussion page? If so, either add a colon just before the word Category, or use the {{cl}} template. Both choices display the category name where you typed it, within your comment, and neither will put the discussion page they're on into a category.

Here's a puzzle that new editors often encounter: How do you mention a category on a discussion page? If you type [[Category:WhatevertheCategoryis]], you've just put the discussion page into that category, and the category name itself won't show up where you typed it. It'll show up at the bottom of the page, which isn't very helpful. Wikipedia has two different ways to display the category link where you want it: Add a colon just before the word Category, or use the {{cl}} template, as shown in Figure 17-9.
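
In wikitext, the difference looks like this, using Category:American economists purely as an example name:

```wikitext
<!-- Puts the discussion page INTO the category; nothing is displayed inline -->
[[Category:American economists]]

<!-- Either of these displays a link inline and does not categorize the page -->
[[:Category:American economists]]
{{cl|American economists}}
```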

If you're looking at a category page for articles, and find talk or user or other pages listed there as well, it's worth taking a quick look at these other pages to see if you can fix the problem with a judicious insertion of a colon. Sometimes you shouldn't (for example, a draft article in user space probably isn't hurting anything if it's being worked on and already has a category), and sometimes you can't easily insert a colon (if the category is embedded in a template, for example). But often the mislisted page is just a simple mistake by an editor that can be easily fixed.

Categories, lists, and navigation templates

Figure 17-10. If this list of economists looks familiar, it's because it's also Figure 14-6 (see the section about formatting list articles), and because Figure 17-1 earlier in this chapter, showing the category page Economists, has almost the same set of links in it. Lists and categories can overlap considerably, but each has strengths and weaknesses.

Categories are not the only way to provide readers with an organized approach that ties a group of articles together. Lists and navigation templates both can do the same thing. For example, Figure 17-10 is a list that essentially does the same thing as the category shown in Figure 17-1 (see the section about list articles).

Figure 17-11. Shown is the navigation template titled Nobel Memorial Prize in Economics: List of Laureates. This navigation template appears in an article when the template {{Nobel Prize in Economics}} is added to the article's wikitext. Navigation templates are appropriate for relatively short lists (the one shown has about sixty links). Navigation templates are also appropriate only when membership in a list is very clear. Prominent British politicians, for example, would not be good for a navigation template, since "prominence" is on a continuum. Even if everyone agreed on relative prominence, the cut-off point for being in or not in the navigation template is still arbitrary.

You've probably seen the third option—navigation templates—but you may not be familiar with the label. Figure 17-11 shows a navigation template for winners of the Nobel Memorial Prize in Economics.

The guideline Wikipedia:Categories, lists, and navigation templates (shortcut: WP:CLN) discusses the advantages and disadvantages of each of these three ways of providing navigation among articles. It also notes that "When developers of these redundant systems compete against each other in a destructive manner, such as by nominating the work of their competitors to be deleted simply because they overlap, they are doing Wikipedia a disservice." In short, there's room on Wikipedia for editors to offer readers multiple ways to get around; no single approach is as good as a mix of all three; and editors who believe their approach is superior should direct their efforts at improving what they favor, to offer something even better to readers, rather than trying to convince other editors that their approach is inferior.

Note:
Search engines do notice category pages. In Google, for example, the page Category:Economists was the fourth result of a search on "economists" (with results limited to pages on the English Wikipedia). The article List of Economists was the second result. (Economist was first; Economics was third.)

If categorizing articles intrigues you, one place to start is the page Category:Underpopulated categories, a category for categories that contain at least one page, and where an editor felt that more pages were needed. Another place is Category:Category needs checking.

Also consider joining WikiProject Categories (shortcut: WP:CATP). And more editors are always welcome at Wikipedia:Categories for discussion (shortcut: WP:CFD).