KO - Knowledge Organization
- Birger Hjørland
This article presents the field of knowledge organization (KO) and its core perspectives: knowledge organization processes (KOPs) and knowledge organization systems (KOS). It provides a brief overview of research traditions, approaches and basic theoretical issues in the field (practicalist and intuitivist approaches, consensus-based approaches, facet-analytic approaches, user-based and cognitive approaches, domain-analytic/epistemological approaches, bibliometric approaches, and IR approaches, among others). The article also briefly presents KO on different technological platforms (physical libraries, archives, museums, classical bibliographical databases and the Internet). The article argues that KO as a part of library and information science can be considered KO in a narrow sense, but that the broader sense of KO is needed to provide the necessary knowledge for the narrow sense.
Contents
- 1. Introduction
- 2. Research traditions, approaches and basic theoretical issues in KO
- 3. KO on different technological platforms
- 4. Other names and other fields
- 5. Conclusion
- Acknowledgement
- References
1. Introduction
Knowledge Organization (KO) is a field of research, teaching and practice, which is mostly affiliated with → library and information science (LIS). KO is first and foremost institutionalized in professorships at universities around the world, in teaching and research programs at research institutions and schools of higher education, in scholarly journals (for example, Knowledge Organization, 1993-), in national and international conferences, and in national and international organizations (for example, the International Society for Knowledge Organization, ISKO, cf., Dahlberg 2010).
KO is about describing, representing, filing and organizing documents and document representations as well as subjects and concepts both by humans and by computer programs (cf., Hjørland 2008). For these purposes, rules and standards are developed, including classification systems, lists of subject headings, → thesauri and other forms of metadata. The organization of knowledge into classification systems and concept systems is a core subject in KO. The two main aspects of KO are (1) knowledge organization processes (KOP) and (2) → knowledge organization systems (KOS). Knowledge organization processes (KOP) are, for example, the processes of cataloging, subject analysis, → indexing, → tagging and → classification by humans or computers. Knowledge organization systems (KOS) are selections of concepts with an indication of selected semantic relations. Examples are classification systems, lists of subject headings, thesauri, ontologies and other systems of metadata.
1.1. History of knowledge organization
Among the different aspects of the history of KO are the following:
- (1) History of library classification systems
Covers library classifications from ancient times until today. The main work about this topic is Samurin 1964. The history will be covered in this encyclopedia by two articles: I. Introduction and premodern classification, covering the period from ancient times until the rise of modern library classifications and classification theory in the last quarter of the 19th century; II. Modern and postmodern library classifications. According to the dominant understanding in the field, the most important theoretical development in the 20th century was facet-analytical theory. It is based on Aristotelian logic and is discussed by Hjørland (2013b). The Dewey Decimal Classification has become the dominant system internationally but is often criticized for its lack of theory and for being, in important ways, suboptimal compared to other systems. Library classifications today also face competition from systems developed in other contexts (cf. Martínez-Ávila 2016).
- (2) History of the classifications of the sciences
This aspect is part of the field because library classifications are often based on classification of the sciences (even if there are also phenomena classifications which are not). The main work about this topic is Kedrow 1975, and Peirce is among the contributors. A related problem is the organization of encyclopedias. Today, bibliometric mapping is a dominant method for studying relations between research fields, but research classifications developed for administrative purposes, such as the Frascati Manual (OECD 2015), and domain-analytic studies (e.g., Wallerstein et al. 1996) are also important.
- (3) History of scientific taxonomies (classification in the sciences)
This includes the systematic botanists such as Konrad Gesner and Carl Linné, as well as developers of the periodic system of chemical elements such as Mendeleev and Meyer.
- (4) History of the theory of classification and concept theory
There is no known main work covering this field. Important contributors include Aristotle, Darwin, Wittgenstein, Rosch, Kuhn and many others (see Hjørland 2017, → section 4). Definition and combination of concepts were also studied in projects of → ideal languages by such authors as Llull, Bisterfeld, Dalgarno, Wilkins and Leibniz. Recent debates concerning numerical, evolutionary and cladistic approaches are also related.
- (5) History of knowledge organizing systems and processes
There is no known main work on this topic, but Keyser 2012 is among the many texts. The history of indexing and alphabetization belongs here.
- (6) History of knowledge organization as a discipline
The development of KO as a discipline for research and teaching is mainly tied to the development of library and information science as a university discipline (or professional school discipline), that is, after 1850. People like Charles A. Cutter, W. C. Berwick Sayers and Ernest Cushing Richardson established the field of "knowledge organization" as an academic field around 1900. Henry Bliss's book (1929) The organization of knowledge and the system of the sciences also represents one of the main intellectual contributions establishing the field. These authors argued that book classification should be based on knowledge organization as it appears in science and in scholarship learning. Two further important events in the development of KO as an organized field of study, both led by Ingetraut Dahlberg, were the creation of the journal International Classification in 1974 (renamed Knowledge Organization from 1993) and the establishment of the International Society for Knowledge Organization (ISKO) in 1989, of which the journal became an official organ. Describing the history of the field is difficult. For example, the thesaurus today is clearly a part of the discipline, but it was originally external to the discipline. Also, what to consider important contributions depends on the metaperspective from which the field is considered.
2. Research traditions, approaches and basic theoretical issues in KO
Traditionally, approaches to KO are divided into human-based approaches versus machine-based approaches (cf. Anderson and Pérez-Carballo 2001a, b). There are, however, many different kinds of human-based approaches and many different kinds of computer-based approaches, and they are not necessarily always distinct. For example, human-based approaches may be very mechanical, if humans just follow simple rules that they have learned, such as an alphabetical arrangement, or finding the best matches for book titles in a given KOS. Both humans and machines may or may not base their classification on citations, but if they both do, they are applying a similar approach. Hjørland (2011b), therefore, argued that this traditional distinction is theoretically unfruitful. Alternatively, it has been suggested that human indexers as well as programmers are guided by their knowledge and theories, which, at the deepest level, are connected to their (often implicit) theories of knowledge. However, it is often difficult to reveal what kind of theoretical assumptions guide the KOPs. Such processes are often carried out intuitively, and some systems have been difficult to relate to a theory. However, the following eight traditions, in and outside of KO, are probably the most influential and the most important today.
A: Approaches developed inside of KO
2.1 Practicalist and intuitivist approaches
These are approaches that give priority to practical matters, such as using the same classification system for several libraries and thereby facilitating the centralization of classification and indexing. From this perspective, KO should be balanced between, on the one hand, adequate and updated subject knowledge and, on the other hand, the need for stability, in order to avoid reclassification. The model here is the Dewey Decimal Classification system (DDC, first edition constructed by Melvil Dewey in 1876), which today is the dominant library system worldwide. (Practicalism, as described in this section, should not be confused with pragmatism, which has a deep intellectual foundation and commitment and one that is important in the domain-analytic approach, as described in 2.5 below).
Another example is the journal classification in the citation databases: "The Institute of Scientific Information (ISI) itself provides a classification of journals at the level of the database that has been based on intuitive criteria" (Pudovkin & Garfield 2002; here cited from Leydesdorff 2006, 602). In other words, no research-based criteria were used, just the intuition of the classifiers.
2.2 Consensus based approaches
Henry E. Bliss (1929, 1933) found that library classification should be based on what he referred to as the scientific and educational consensus. "Topics should be collocated and placed in classes not according to the whim of the person who devises the classification system, but according to the standards set by scientists and educators" (Drobnicki 1996, 3). It was characteristic that (a) Bliss consulted the scholarly literature and (b) he believed that one is able to detect an underlying pattern of agreement. Eugene Garfield has described Henry Bliss as "a true scholar. His goals and aspirations were different from those of Melvil Dewey, whom he certainly surpassed in intellectual ability, but by whom he was dwarfed in organizational ability and drive" (Garfield 1974, 291). Bliss's view of consensus probably reflected the positivism or modernism of his time. He wrote (1933, 37):
- The more definite the concepts, the relations, and the principles of science, philosophy, and education become, the clearer and more stable the order of the sciences and studies in relation to learning and to life; and so the scientific and educational consensus becomes more dominant and more permanent.
Kruk (1999, 137) is among the critics of this view and wrote: "In the twentieth century knowledge is not perceived as a solid structure any more. The universal library is a utopian vision and it belongs to the same category as the universal encyclopedia and the universal language". Today, Bliss's view is contrasted by a view of knowledge that is much more concerned with conflicting interests and perspectives (cf. the domain analytic view, 2.5). His engagement with literature that has to be classified is, however, still an important principle.
Bliss's reception may reveal something about the hostility that serious academic work may encounter in a librarianship dominated by practicalism:
- "Bliss had announced his intention to develop a new general classification in the Library Quarterly in 1910. The announcement met with bitter hostility, not from Melvil Dewey (Bliss always said that his personal relations with Dewey were cordial [...]) but from some of Dewey's disciples. Bliss gradually became a rather solitary figure in the American library scene, and his later work met with apathy" (Campbell 1976, 139). Further: "Bliss's first book, The Organization of knowledge and the system of the sciences, was published in 1929 by Henry Holt & Co., New York, after he had failed to interest the American Library Association in it. Only three of Bliss's papers were ever published by the Association, and two of those were condensed [...]. The American Library Association, after negotiations lasting several years, refused to publish his second book without a generous subsidy from the author sufficient to cover all publishing costs" (Campbell 1976, 139).
Fortunately, this hostility did not hinder Bliss's recognition: "The two books [...] and the outline version of his scheme, A System of Bibliographic Classification (1935, 2nd ed., 1936) won him a reputation in many parts of the world as an original thinker of great power, and a classificationist who was not afraid to tread out new paths" (Campbell 1976, 139).
2.3 Facet-analytic approaches
The facet-analytic paradigm is probably the most distinct approach to knowledge organization that has been developed within LIS. It is mainly attributed to S. R. Ranganathan and the British Classification Research Group, but it is fundamentally based on the principles of → logical division developed more than two millennia ago (Mills 2004). Faceted systems differ from enumerative systems in that they do not list all of their classes, but provide building blocks from which specific classes for each document may be formed. This approach still has a strong position in the field and it is the most explicit and "pure" theoretical approach to knowledge organization (KO). The strength of this approach is its logical principles and the way it provides structures in knowledge organization systems (KOS). The main weaknesses of this approach are (1) a lack of an empirical basis in its methodology (although, of course, any given faceted classification must have a basis in some empirically derived list of concepts) and (2) a speculative ordering of knowledge without a basis in the development or influence of theories and socio-historical studies. It seems to be based on the problematic assumption that relations between concepts are a priori and are therefore not established by the development of models, theories and laws (see further in Hjørland 2013b).
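The contrast between listing classes in advance and synthesizing them from building blocks can be made concrete with a minimal sketch. The facet names, their values and the notation below are illustrative assumptions and are not taken from the Colon Classification or any other published scheme.

```python
from itertools import product

# Hypothetical facets; names and values are illustrative assumptions only.
FACETS = {
    "material": ["cotton", "wool"],
    "process": ["spinning", "weaving"],
    "place": ["India", "Britain"],
}

def enumerate_classes(facets):
    """Enumerative style: list every full compound class in advance."""
    names = list(facets)
    return [dict(zip(names, combo)) for combo in product(*facets.values())]

def build_class(facets, **chosen):
    """Faceted style: synthesize one class from the values a document needs."""
    for name, value in chosen.items():
        if value not in facets.get(name, []):
            raise ValueError(f"unknown value {value!r} for facet {name!r}")
    # The order of facets in the scheme fixes the order in the notation.
    return " : ".join(f"{name}={chosen[name]}" for name in facets if name in chosen)

all_classes = enumerate_classes(FACETS)
label = build_class(FACETS, material="wool", process="weaving")
```

Under these assumptions the enumerative listing grows multiplicatively with every new facet value, whereas the faceted scheme only stores the facets themselves and builds a class such as `label` when a document requires it.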
2.4 User-based and cognitive approaches
A distinction should be made between user-friendly KOS and user-based KOS. Today, it seems to be evident that KOS should be user-friendly, but this was not always the case (see Hjørland 2013c and Jensen 1973). It is not evident, however, that user-friendly systems should be built on information collected from users or about users. Extremely successful systems such as Apple's iPhone, Dialog's search system and Google's PageRank, for example, are not based on empirical studies of users. Actually, the idea that KOS should be based on user studies (rather than, for example, on literary warrant, → logical division, word statistics or scholarly theories) seems to be an unsupported hypothesis. Nonetheless, it is a family of approaches that has its supporters (for further information see Hjørland 2013c).
2.5 Domain-analytic/epistemological approaches
A core principle of the domain-analytic approach is: "The starting point for understanding classification is one that any object, any document and any domain could be classified from multiple equal correct perspectives." (Mai 2011, 723). In other words:
- Different communities may be interested in the same object (e.g. a stone in the field [or a given book]) but may interpret it differently (e.g. from an archeological or geological point of view). What is informative (and thus information) depends on the point of view of the specific community. (Hjørland 2002, 116)
In contrast to consensus-based approaches (2.2 above), domain analysis assumes the existence of multiple perspectives. Disagreement is common and "the picture is really not one of agreement, but of conflicting schools, and the closer the neighbours the sharper the conflict" (Broadfield 1946, 69). Of course, the degree of consensus is stronger in some domains than in others. Recently, a revolution has taken place in ornithology and it seems as if the new classification of birds has a very strong scientific basis and a high degree of consensus (see Fjeldså 2013). To examine the warrant for a classification is, of course, part of the domain-analytic framework. It is also important to realize that not every perspective or classification is as important as any other. One should not subscribe to relativism out of convenience, i.e., abstain from considering the strengths and the weaknesses of different perspectives or paradigms.
Ingetraut Dahlberg has expressed the view that KO is part of the metasciences:
- I consider Knowledge Organization as a subdiscipline of Science of Science with application fields not only in the Information Sciences but also for all subject fields (domains) needing Taxonomies (classification systems of objects) and other fields like Statistics, Commodities, Utilities, Weapons, Patents, Museology etc.
- According to Science Theory, every domain has its own area of objects and of methods and processes, next to other relationships. (Dahlberg cited from Dodebei 2014).
Hjørland (2011b) also claims the importance of the theory of knowledge for indexing and for information retrieval. Today, medical doctors often rely on systematic reviews that are based on the paradigm termed evidence-based medicine (EBM, or, across disciplines, evidence-based practice, EBP). By implication, indexing and retrieval have to adapt to the criteria for what counts as knowledge in this paradigm. The same is, of course, the case in other fields and in the case of conflicting paradigms. In general, criteria for organizing knowledge are to be found in the subject fields, their theories and their paradigms. It is therefore important, with Dahlberg, to consider KO as a science of science.
From the domain-analytic perspective, the term KO better reflects the connection to the metasciences than does the term information organization, IO. KO points to the related fields of history, philosophy and the sociology of knowledge (among other fields). This is one argument for considering KO the preferred term (see further in Hjørland 2012b).
A model of a domain-analytic study is Ørom (2003) who identified different "paradigms" in the art studies and compared them with major library classification systems.
B: Approaches developed outside of KO (but representing competing approaches, which are necessary to consider)
2.6 Bibliometric approaches
Bibliometrics (with altmetrics, informetrics, scientometrics and webometrics) is an interdisciplinary field with strong affiliations to LIS. This field developed techniques for producing bibliometric maps based on co-citation analysis, bibliographic coupling, or direct citation. Such maps may serve information retrieval and are a competing or supplementary approach to knowledge organization, although the fields of KO and bibliometrics have so far not had much mutual contact. Among the main bibliometric researchers are Eugene Garfield, Henry Small and Howard D. White. Bibliometric methods are sometimes considered as being "objective", but Hjørland (2013a and 2016b) argues that this is not the case and he considers the strong and weak sides of this approach to KO.
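Two of the measures named above can be illustrated with a minimal sketch. The citation data and the function names are hypothetical, and real bibliometric mapping adds normalization, clustering and visualization steps that are not shown here.

```python
from itertools import combinations
from collections import Counter

# Hypothetical citation data: each citing document maps to the set of
# documents in its reference list.
references = {
    "doc1": {"A", "B", "C"},
    "doc2": {"A", "B"},
    "doc3": {"B", "C", "D"},
}

def cocitation_counts(refs):
    """Count how often each pair of cited documents shares a reference list."""
    counts = Counter()
    for cited in refs.values():
        for pair in combinations(sorted(cited), 2):
            counts[pair] += 1
    return counts

def coupling_strength(refs, x, y):
    """Bibliographic coupling: number of references two citing documents share."""
    return len(refs[x] & refs[y])

cocited = cocitation_counts(references)
shared = coupling_strength(references, "doc1", "doc3")
```

Co-citation relates the cited documents (A and B are cited together twice in this toy data), while bibliographic coupling relates the citing documents (doc1 and doc3 share two references). Either count can serve as a similarity measure from which a map is drawn.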
2.7 IR approaches
Information retrieval (IR) is, today, a term mainly related to computer science. Formerly, it had strong relations to information science, but the field has largely migrated to computer science. Among the basic assumptions and techniques of this approach is the study of statistical relations between terms, documents and collections of documents. Among the main IR researchers are Gerard Salton, Karen Spärck Jones, Stephen Robertson and C. J. "Keith" van Rijsbergen. Again, if the purpose of a KOS is to help users to identify relevant documents, then IR is a family of competing approaches when compared to the approaches studied by the KO community. As such, it is a very successful family of approaches. Robertson (2008) stated, "statistical approaches won, simply. They were overwhelmingly more successful [compared to other approaches such as thesauri]." This issue is further addressed in, for example, Hjørland (2016a).
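As a minimal sketch of what a statistical relation between terms, documents and a collection can look like, the example below uses TF-IDF weighting. This is one common scheme chosen here for illustration; the text above does not name a specific technique, and the toy collection and function names are assumptions.

```python
import math
from collections import Counter

# Hypothetical three-document collection, already tokenized.
docs = {
    "d1": ["thesaurus", "indexing", "retrieval"],
    "d2": ["retrieval", "ranking", "retrieval"],
    "d3": ["classification", "indexing"],
}

def tf_idf(docs):
    """Weight each term by its frequency in a document and its rarity in the collection."""
    n_docs = len(docs)
    doc_freq = Counter(term for tokens in docs.values() for term in set(tokens))
    weights = {}
    for doc_id, tokens in docs.items():
        term_freq = Counter(tokens)
        weights[doc_id] = {
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        }
    return weights

def rank(query_terms, weights):
    """Order documents by the summed weights of the query terms they contain."""
    scores = {
        doc_id: sum(w.get(term, 0.0) for term in query_terms)
        for doc_id, w in weights.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

weights = tf_idf(docs)
ranking = rank(["retrieval", "ranking"], weights)
```

No controlled vocabulary or human indexing is involved: the ranking follows entirely from word counts in the documents and in the collection, which is the sense in which IR competes with the approaches studied by the KO community.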
2.8 Other approaches
Many other approaches exist. Here just two will be mentioned. Heinrich Herre (2013) discussed an ontological approach that provided formal specifications and harmonized the definitions of concepts used to represent the knowledge of specific domains. It made use of the onto-axiomatic method, of graduated conceptualizations, of levels of reality, and of top-level-supported methods for ontology-development.
Jack Andersen (2015) is a main representative of a genre approach to knowledge organization. He wrote:
- [A]s Bazerman (2012) reminds us, while recognizing the social importance of effective search engines and other systems of structuring knowledge and inscribing writing, we still need to understand the activity contexts of those producing and using knowledge and information because no matter how fragmentary, how automatic, and how fast information comes to a user, the very user (herself/himself placed in an activity contexts [sic!]) must ultimately make sense of the information found and that sense cannot be made without understanding the various of activity (and the practices) producing that information (Andersen 2015, 14-15).
We have now presented an overview of the approaches to KO and of the competing approaches from outside KO. It is obvious that these, as well as other approaches, need careful consideration, and that important strategic decisions are involved in the choice of theory. The future of the field of KO depends on whether its research, teaching and practice will provide helpful systems and services for given user groups, or whether existing systems like Google already provide satisfying results. A core issue is, therefore, to evaluate the relative strengths and weaknesses of different approaches. As already stated, Hjørland (2015a) argued that for serious purposes, such as for medical decisions, classical databases are still needed and that KO needs to be further developed to make searches more efficient.
3. KO on different technological platforms
Ideally, KO should be understood as being a knowledge base that can be applied to all technological platforms. However, its development has often been technology-driven. Therefore, an overview of KO on different platforms is provided in this section.
3.1a KO in physical libraries
KO in libraries is mainly represented by classification systems and indexing systems such as the Dewey Decimal Classification (DDC) and the Library of Congress Subject Headings (LCSH).
Library classification systems may be developed for the double function of shelving physical documents and serving as a tool for information retrieval (IR), including browsing in printed catalogs (and, from the 1980s, in OPACs, online public access catalogs). The function as a shelving tool puts major restrictions on the design of classifications because such systems must arrange all documents in a linear sequence. This double function of classification systems may be an economic and a management advantage within some contexts, but it implies that the function of classifications as an IR tool is based on restrictions that are unnecessary from the retrieval perspective.
While many (big) libraries have developed tailored classifications, some systems have been used by many libraries and they may be considered to be kinds of standards. Among the best known library classification systems are the DDC (first published in 1876, 23rd edition published in 2011), the Library of Congress Classification (LCC), 1901- (regularly updated), and the Universal Decimal Classification (UDC), first published in 1905-1907 (latest "full edition" 2005).
From a research perspective, we may ask what kind of theory underlies such a KOS. It could be said that the DDC emphasizes practicalities, efficient management, and standards rather than a scholarly, theoretical approach. It is the world's most widely used library classification system, but it is not optimal for any particular collection or target group, and, according to James Blake (2011, 469-470) among others, it does not reflect current scientific knowledge. Although Blake found that "such 'outdated' classifications may still do their job well" (2011, 470), this seems to reflect a lack of ambition in providing up-to-date information and a prioritizing of library management issues over advanced IR requirements. DDC is probably the system that has meant most for the institutionalization and ideology of LIS and KO.
LCC was developed on the basis of the collections of the Library of Congress and thus reflects this specific collection. The major principles of this system are its basis in "literary warrant" and the enumeration of classes (as opposed to faceted systems). Vanda Broughton (2004, 143) wrote, "It is quite hard to discern any strong theoretical principles underlying LCC." Some formulations by S. R. Ranganathan (e.g. 1951) have also suggested that such "traditional" systems seem to lack a theoretical foundation (in his eyes, as opposed to his own approach). In the past, the LCC and the UDC reflected current scholarly knowledge much better than the DDC did (but the UDC scheme, in particular, has not generally been kept updated, cf. Hjørland 2007a). Although it has been said that such systems lack a theoretical foundation, it can be argued that their implicit principles are
- that they should reflect current subject knowledge, so that their theoretical basis is to be found in the epistemological assumptions with which they reflect the subject fields covered;
- that they should be based on the principle of literary warrant, first formulated by Hulme (1911), which means that they are based on the literature that they classify. (The LCC, in particular, is based on classifying the books in the Library of Congress, but because of the size of the collections, it has turned out to be fruitful for many other large research libraries.)
Faceted library classification systems were developed in the first half of the 20th century, as opposed to enumerative systems. The LCC is the model of an enumerative system, in which all of the classes are listed (and the system is, therefore, comprehensive; LCC fills up about 41 volumes). Faceted systems, on the other hand, do not list all of their classes, but provide building blocks from which specific classes for each document may be formed (Ranganathan was inspired by the Meccano toy). While the UDC may be considered to be a forerunner partly based on facet-analytic principles, the most well-known systems in this tradition are the Colon Classification (CC) developed by S. R. Ranganathan in 1933 and the Bliss Bibliographic Classification, 2nd ed. (BBC2), developed by Jack Mills, Vanda Broughton and others from 1977 (still in progress). While these systems represent progress in research and development, their practical influence has been disappointing, although their principles have gradually influenced other systems, including the DDC.
BBC2, the CC, the DDC, the LCC and the UDC are universal systems, covering all fields of knowledge, although some (e.g., BBC2, the LCC and the UDC) may be considered sets of domain-specific systems, each of which as a whole makes up a universal system. Universal systems are less important for special libraries and for scholarly subject retrieval when compared with special systems that have been designed for subject bibliographies such as MEDLINE or PsycINFO. When online bibliographic databases developed from about 1963 (cf., Hahn 1998), the development of domain-specific thesauri for online searching became a research front in KO. However, some researchers, for example, Szostak, Gnoli & López-Huertas (2016), argue that universal systems are important for interdisciplinary research. Although research is still done on library classification and indexing systems, this area has lost importance when compared with research on other kinds of KOS that are better adapted to and used by online retrieval systems.
The most used basis for organization in universal systems has been the division (or collocation) by scholarly disciplines. The DDC, for example, states that
- [A] work on water may be classed with many disciplines, such as metaphysics, religion, economics, commerce, physics, chemistry, geology, oceanography, meteorology, and history. No other feature of the DDC is more basic than this, that it scatters subjects by discipline (Dewey 1979, p. xxxi).
The alternative principle, collocation by phenomena, has also sometimes been preferred and has its supporters (see, for example, Ahlers Møller 1981; Beghtol 2004; Brown 1914; Szostak, Gnoli & López-Huertas 2016).
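The difference between the two principles can be sketched by grouping the same works under each in turn. The works and their tags below are hypothetical and serve only to show how a subject such as water is scattered under one principle and collocated under the other.

```python
from collections import defaultdict

# Hypothetical works, each tagged with the phenomenon treated and the
# discipline from which it is treated.
works = [
    {"title": "Work 1", "phenomenon": "water", "discipline": "chemistry"},
    {"title": "Work 2", "phenomenon": "water", "discipline": "religion"},
    {"title": "Work 3", "phenomenon": "salt", "discipline": "chemistry"},
]

def group_by(works, key):
    """Collocate works under the chosen primary characteristic."""
    groups = defaultdict(list)
    for work in works:
        groups[work[key]].append(work["title"])
    return dict(groups)

by_discipline = group_by(works, "discipline")
by_phenomenon = group_by(works, "phenomenon")
```

With the discipline as the primary characteristic, the two works on water end up in different classes. With the phenomenon as the primary characteristic they stand together, but the chemistry works are separated instead.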
During the 1980s, library catalogs became available as OPACs. This allowed users to search the catalog from remote terminals, e.g., from the users' homes. OPACs also provided better search possibilities, but to a large extent, they continued to use the same kinds of KOS as were developed in the age of the card catalog.
3.1b KO in archives
Archival science is an independent field with its own journals, conferences, textbooks and encyclopedias (e.g., Fox and Wilkerson 1998; Duranti and Franks 2015). Knowledge organization of archives should, however, also be considered to be a part of KO, as was defined at the beginning of this article. Archives may contain official records, business records, images, letters, diplomas, etc. The most important specific principle of organization for this domain is the principle of provenance.
- Provenance is a fundamental principle of archival science, referring to the individuals, groups, or organizations that originally created or received the items in a collection, and to the items' subsequent chain of custody. According to archival theory and the principle of provenance, records which originate from a common source (or fonds) should be kept together — either physically, or, where that is not practicable, intellectually in the way in which they are catalogued and arranged in finding aids — in accordance with what is sometimes termed the principle of archival integrity or respect des fonds. Conversely, records of different provenance should be separated. In archival practice, proof of provenance is provided by the operation of control systems that document the history of records kept in archives, including details of amendments made to them. The authority of an archival document or set of documents of which the provenance is uncertain (because of gaps in the recorded chain of custody) will be considered to be severely compromised. (Wikipedia 2016)
Archives normally collect unique objects in contrast to libraries collecting single copies of published works, of which many more copies may typically exist. (For further information about KO in archives, see also Sweeney, 2010).
3.1c KO in museums
Museology (or museum studies) is, like archival science, an independently organized field. Museums have, like archives and libraries, developed systems for organizing their objects and the knowledge they transmit (cf., Neilson 2010). Like archives, museums normally collect unique objects.
ICONCLASS (iconclass.nl) is an example of a subject-specific international classification system for iconographic research and the documentation of images. It contains definitions of objects, people, events, situations and abstract ideas that can be the subject of an image. It consists of a classification system with approximately 28,000 definitions, an alphabetical index, and a bibliography with 40,000 references to books and articles of an iconographical and cultural historical interest.
Ørom (2003) pointed out that the organizing principles of museum exhibitions may reflect a worldview or a scholarly paradigm that is reflected not only in the organization of the objects in museums, but also in the literature and in the classification systems of libraries. In other words, Ørom demonstrated a common theoretical basis of KOS.
3.2 KO in classical bibliographic databases
The electronic scholarly databases, such as MEDLINE, PsycINFO and the Science Citation Index, developed earlier than did the OPACs, and in many ways, they represent IR-systems that are more advanced. Such "classical databases" are "records databases" (cf., Voss 2013, 79), in which each document is represented by a bibliographical record consisting of many separate fields, thus providing well-defined data. Hjørland (2015a) made the following points:
- A given record contains a mixture of fields that are derived from the document it represents, as well as information that is added by the database producer (it may contain the whole document, in addition to value-added information). It may also contain information imported from third parties, as well as user-added information, in the case of social tagging and related technologies.
- Some fields contain controlled vocabularies that have been developed by information professionals (e.g., descriptors and classification codes). Other fields contain "natural language" (i.e., the authors' language for special purposes). Many databases today include citation indexing, that is, the possibility of searching the bibliographical references in each document that is represented.
- It is important to realize that the efficiency of given fields for optimal search strategies varies from domain to domain. The value of searching document titles, for example, depends on how titles are used in the different domains; in the social sciences, the use of metaphors in titles may limit the value of title searches. See further in Hjørland & Kyllesbech Nielsen (2001).
- By implication, the experienced searcher should know not just about the database systems and the bibliographic records (or full-text records), but also about the concepts and the genres of the primary literature. This aspect connects information science with fields such as scholarly communication, written composition, genre studies, and language for special purposes. Whereas KO in a narrower sense is about the design of bibliographical records and systems of controlled vocabularies, KO in a broader sense is about how knowledge is organized in different domains and how this can be used for IR. These broader perspectives on KO become increasingly important in the context of full-text databases and the Internet.
- The specific requirement of indexing for Boolean searches is to represent the different facets of the document that are used in the search process. Each facet is constructed by combining terms with the Boolean operator "or", and the facets are combined by the Boolean operator "and". This is known as the "building blocks search strategy" (cf., Harter 1986, 242), which in some respects resembles the facet-analytical approach mentioned above. However, faceted classifications are seldom or never used for this purpose. Perhaps the reason is that the tradition of facet analysis is mainly logical or speculative, whereas indexing for retrieval must anticipate which facets need to be combined during a search. For example, in evidence-based practice (EBP), the methodological facet is important (e.g., a descriptor for randomized clinical trials, which has only been considered in MEDLINE since about 1994, after the breakthrough of the EBP paradigm). See further in Hjørland (2011a). The construction of facets should, therefore, be based on studies of researchers' criteria of relevance (which are most explicated in EBP).
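The building blocks search strategy described in the last point can be illustrated with a minimal Python sketch. It is not taken from Harter (1986) or from any actual database system; the record structure, the field names (`title`, `descriptors`), the function names and the sample records are all assumptions made for illustration only. Terms within a facet are combined with "or", and the facets themselves are combined with "and".

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Record:
    """A simplified, hypothetical bibliographic record with two fields."""
    title: str                                          # "natural language" field
    descriptors: Set[str] = field(default_factory=set)  # controlled vocabulary


def matches_facet(record: Record, facet_terms: Set[str]) -> bool:
    """A facet is satisfied if ANY of its terms occurs (Boolean "or")."""
    title = record.title.lower()
    return any(term in record.descriptors or term in title
               for term in facet_terms)


def building_blocks_search(records: List[Record],
                           facets: List[Set[str]]) -> List[Record]:
    """Keep the records that satisfy EVERY facet (Boolean "and")."""
    return [r for r in records
            if all(matches_facet(r, facet) for facet in facets)]


# Invented sample data: a topical facet and a methodological facet.
records = [
    Record("Aspirin for headache: a randomized trial",
           {"aspirin", "headache", "randomized controlled trial"}),
    Record("A case report on aspirin overdose",
           {"aspirin", "case reports"}),
    Record("Acetylsalicylic acid and migraine",
           {"migraine disorders", "randomized controlled trial"}),
]
facets = [
    {"aspirin", "acetylsalicylic acid"},  # topic: synonyms joined by "or"
    {"randomized controlled trial"},      # method: the EBP-relevant facet
]

for hit in building_blocks_search(records, facets):
    print(hit.title)
```

In this sketch the third record is retrieved only because the title field supplements the descriptor field, which is one small instance of different subject access points complementing each other.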
The most important developments in classical databases from the perspective of KO were:
- The study of the relative importance of "natural language" and "controlled vocabularies" (cf., Svenonius 1986)
- The realization that many different "subject access points" (SAP) supplement each other and that no system can guarantee a full retrieval of relevant records without noise (cf. Hjørland & Kyllesbech Nielsen 2001)
- An emphasis on the development of domain specific thesauri for IR
Classical bibliographical databases have, in general, lost importance when compared with Internet search engines. Today it is an open question whether, for example, the traditional thesaurus still has a role to fill in modern information retrieval (see Dextre Clarke and Vernau 2016). Hjørland (2015a) has argued, however, that for serious scholarly purposes, it is important that users or intermediaries are able to control the search process. For such tasks, classical databases seem to be the most advanced tools.
3.3 KO on the Internet
The Internet and its search engines have revolutionized the way people search for and find information. When compared to classical databases, which require professional information specialists or search-competent end-users, search engines are (or seem to be) very easy to use. In addition, search engines have a broad and comprehensive coverage of many kinds of documents. The Internet has become the most important medium for organizing and searching information and documents. Information architecture has developed as a new field concerned with organizing knowledge on the WWW (see e.g., Rosenfeld and Morville 1998). Its medium is new, but its basic principles are part of the field of KO, as defined at the beginning of this article.
In parallel with the development of the Internet, a new kind of KOS that has been termed "ontologies" became important from the 1990s. Dagobert Soergel wrote:
- Classification has long been used in library and information systems to provide guidance to the user in clarifying her information need and to structure search results for browsing, functions largely ignored by the text retrieval community but now receiving increasing attention in the context of helping users to cope with the vast amount of information on the Web. Fairly recently, other fields, such as AI, natural language processing, and software engineering, have discovered the need for classification, leading to the rise of what these fields call ontologies (Soergel 1999, 1119). But a classification by any other name is still a classification (Soergel 1999, 1120)
Soergel's main point is that ontologies are basically classification systems and that they represent a "reinvention of classification" by the new research communities, which have had little communication with, and mutual learning from, the field of KO. Ontologies may, however, be considered to be more general and more abstract forms of KOS. All traditional forms, such as classification systems and thesauri, may just be understood as being restricted kinds of ontologies. Lars Marius Garshol wrote:
- [T]opic maps [ontology-based systems] can actually represent taxonomies, thesauri, faceted classification, synonym rings, and authority files, simply by using the fixed vocabularies of these classifications as a topic map vocabulary (Garshol 2004).
Given this perspective, it seems less important for KO to investigate specific forms of KOS, such as classification systems or thesauri. It seems more important to abstract to the systems of concepts and their semantic relations (Hjørland 2007b) and to understand each specific kind of KOS as based on principles that are general for all KOS. As a result, Hjørland (2016a) has argued that thesauri would benefit from adopting some of the principles that are used in ontologies.
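The idea that a thesaurus is a restricted kind of ontology can be made concrete with a small Python sketch. It is only an illustration under stated assumptions, not a rendering of Garshol's topic maps or of any ontology standard: a KOS is modelled here as a list of (source, kind, target) relations between concepts, and a thesaurus is taken to be a KOS that limits itself to the conventional thesaurus relations BT (broader term), NT (narrower term), RT (related term) and USE. The names `Relation`, `is_thesaurus` and `broader_terms`, as well as the sample concepts, are hypothetical.

```python
from typing import List, NamedTuple


class Relation(NamedTuple):
    """One semantic relation between two concepts in a KOS."""
    source: str
    kind: str
    target: str


# Assumption: the conventional thesaurus relation types.
THESAURUS_KINDS = {"BT", "NT", "RT", "USE"}


def is_thesaurus(relations: List[Relation]) -> bool:
    """True if the KOS uses only the restricted thesaurus relation types."""
    return all(r.kind in THESAURUS_KINDS for r in relations)


def broader_terms(relations: List[Relation], concept: str) -> List[str]:
    """Follow BT links transitively, nearest broader term first."""
    found: List[str] = []
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for r in relations:
            if r.source == current and r.kind == "BT" and r.target not in found:
                found.append(r.target)
                frontier.append(r.target)
    return found


# Invented sample KOS about birds, using thesaurus relations only.
bird_kos = [
    Relation("sparrows", "BT", "passerines"),
    Relation("passerines", "BT", "birds"),
    Relation("birds", "RT", "ornithology"),
]

# Adding a relation type outside the thesaurus set gives a more general KOS.
bird_ontology = bird_kos + [Relation("sparrows", "eats", "seeds")]

print(is_thesaurus(bird_kos))       # True
print(is_thesaurus(bird_ontology))  # False
print(broader_terms(bird_kos, "sparrows"))
```

The same data structure serves both cases; only the set of admitted relation types differs, which is the sense in which the general principles of concepts and semantic relations apply to all KOS.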
From the point of view of KO as a field of research, teaching and practice, we may ask, "What are the implications for us? Is classification still needed after Google?" (cf. Hjørland 2012a). "Does the traditional thesaurus have a place in modern information retrieval?" (cf. Dextre Clarke and Vernau 2016). To answer these questions, we have to examine the potential value of different approaches. On this basis, we must estimate how we may continue to contribute to making documents findable (cf. Section 2).
4. Other names and other fields
The term KO is also used in other fields, such as cognitive psychology and information management (and is thus a homonym). This entry is about KO as related to LIS (KO in a narrow sense). KO in a broader sense is concerned with:
- How knowledge is organized in society (e.g., in scholarly disciplines and in the social division of labor). This is a social KO perspective and is, in particular, relevant for disciplinary classification. An example is Oleson & Voss (1979), The Organization of Knowledge in Modern America, 1860-1920. This dimension is covered by fields such as the sociology of knowledge and the social history of knowledge, among others.
- How knowledge is organized in scholarly theories, such as biological taxonomies. This may be termed "intellectual classification" (as opposed to social KO). An example is Fjeldså (2013), about the classification of birds. This dimension is covered by the single sciences and by fields such as philosophy and science studies.
This differentiation between the social and the intellectual organization of knowledge is taken from Whitley (1984). There are, of course, mutual interactions between social KO and intellectual KO. KO in the narrow sense is dependent on KO in the broader sense (i.e., on subject knowledge about an intellectual classification; for example, the classification of documents about birds reflects how birds themselves are classified).
As described elsewhere in this article, different aspects of KO tend to isolate themselves by using separate names, such as "information architecture", and by forming alternative communities. One of the basic claims in this entry is, however, that the phenomena listed at the beginning of this article need to be considered independently of the specific media on which they are used, and independently of the specific traditions and methodologies by which they have been investigated.
5. Conclusion
KO may be understood in narrow senses as well as in broad senses. The narrow senses are, for example, the KOS and KOPs found within LIS. The broad senses are, for example, the conceptual systems, the social fields, and the activity systems that exist or take place in all spheres of society. For us, in the KO community within LIS, the purpose of studying and teaching KO is to develop better information services, whatever that means. Different approaches and theories exist, both in and outside of KO, and it is strategically important that our teaching and our research in KO are based on well-considered and well-informed choices. The broader kinds of KOS (e.g., activity systems and scientific theories) are important because they form the background knowledge needed in order to organize knowledge in the narrower LIS sense (see Hjørland 2015b).
Acknowledgement
The author would like to thank Maria Teresa Biagetti for serving as the editor of this article and the two anonymous referees for providing their valuable feedback.
References
Ahlers Møller, Bente. 1981. "Subject analysis in the library. A comparative study". International Classification 8, no 1: 23-27.
Andersen, Jack. 2015. "Re-describing knowledge organization", in Genre Theory in Information Studies, edited by Jack Andersen, 13-42. Bingley, UK: Emerald Group Publishing Limited.
Anderson, James D. & Pérez-Carballo, José. 2001a. "The nature of indexing: how humans and machines analyze messages and texts for retrieval. Part I: research, and the nature of human indexing". Information Processing & Management 37, no 2: 231-254.
Anderson, James D. & Pérez-Carballo, José. 2001b. "The nature of indexing: how humans and machines analyze messages and texts for retrieval. Part II: machine indexing, and the allocation of human versus machine effort". Information Processing & Management 37, no 2: 255-277.
Bazerman, Charles. 2012. "The order of documents, the order of activity, and the order of information". Archival Science 12, no 4: 377-388.
Beghtol, Clare. 2004. "Exploring new approaches to the organization of knowledge: the subject classification of James Duff Brown". Library Trends 52, no. 4: 702-718.
Blake, James. 2011. "Some issues in the classification of zoology". Knowledge Organization 38, no 6: 463-472.
Bliss, Henry Evelyn. 1929. The organization of knowledge and the system of the sciences. New York: Henry Holt and Company.
Bliss, Henry Evelyn. 1933. The organization of knowledge in libraries and the subject-approach to books. New York: H. W. Wilson. (2. Edition 1939).
Broadfield, A. 1946. The Philosophy of Classification. London: Grafton.
Broughton, Vanda. 2004. Essential classification. London: Facet Publishing.
Brown, James Duff. 1914. Subject classification, with tables, indexes, etc., for the sub-division of subjects. 2nd rev. ed. London: Grafton.
Campbell, D. John. 1976. "A short biography of Henry Evelyn Bliss (1870-1955)". Journal of Documentation 32, no. 2: 134–145.
Dahlberg, Ingetraut. 2010. "International Society for Knowledge Organization (ISKO)", Encyclopedia of Library and Information Sciences, Third Edition, Eds. Marcia J. Bates & Mary Niles Maack (vol. IV, 2941-2949). Boca Raton, Florida: CRC Press.
Dewey, Melvil. 1979. Dewey Decimal Classification and relative index. (19th ed., Vol. 1). Albany, NY: Forest Press.
Dextre Clarke, Stella and Vernau, Judi (Eds.). 2016. Special Issue: The Great Debate: "This House Believes that the Traditional Thesaurus has no Place in Modern Information Retrieval." Knowledge Organization 43, no. 3: 135-214.
Dodebei, Vera. 2014. 13a ISKO International Conference, 19-22 Maio 2014, Cracóvia, Polônia: Relatório da participação da ISKO-Brasil, Maio de 2014. http://isko-brasil.org.br/wp-content/uploads/2014/06/relat_iskoCrac%C3%B3via2014.pdf
Drobnicki, John A. 1996. Bliss: The Man and the Classification. City University of New York (CUNY): CUNY Academic Works. http://academicworks.cuny.edu/cgi/viewcontent.cgi?article=1014&context=yc_pubs
Duranti, Luciana & Franks, Patricia C. 2015. Encyclopedia of archival science. Lanham: Rowman & Littlefield.
Fjeldså, Jon. 2013. "Avian classification in flux". In: Handbook of the Birds of the World, Special volume (17): New species and global index (pp. 77-146). J. del Hoyo, et al., (Eds.). Barcelona: Lynx Edicions.
Fox, Michael J. & Wilkerson, Peter L. 1998. Introduction to Archival Organization and Description. Los Angeles, CA: Getty Information Institute.
Garfield, Eugene. 1974. "The 'Other' Immortal, A Memorable Day with Henry E. Bliss". Wilson Library Bulletin 49, no. 4: 288-292. http://www.garfield.library.upenn.edu/essays/v2p250y1974-76.pdf
Garshol, Lars Marius. 2004. "Metadata? Thesauri? Taxonomies? Topic maps! Making sense of it all". Journal of Information Science 30, no. 4: 378-391. Available online at: http://www.ontopia.net/topicmaps/materials/tm-vs-thesauri.html
Hahn, Trudi Bellardo. 1998. "Text retrieval online. Historical perspectives on Web search engines". Bulletin of the American Society for Information Science 24, no4: 7-10. http://www.asis.org/Bulletin/Apr-98/hahn.html
Harter, Stephen P. 1986. Online information retrieval. Concepts, principles, and techniques. New York: Academic Press.
Herre, Heinrich. 2013. "Formal ontology and the foundation of knowledge organization". Knowledge organization 40, no 5: 332-339.
Hjørland, Birger. 2002. "Principia Informatica. Foundational Theory of Information and Principles of Information Services". In Emerging Frameworks and Methods. Proceedings of the Fourth International Conference on Conceptions of Library and Information Science (CoLIS4). Ed. by Harry Bruce, Raya Fidel, Peter Ingwersen, and Pertti Vakkari (pp. 109-121). Greenwood Village, Colorado, USA: Libraries Unlimited.
Hjørland, Birger. 2007a. "Arguments for 'the bibliographical paradigm'. Some thoughts inspired by the new English edition of the UDC". Information Research 12, no 4: paper colis06. Available at http://InformationR.net/ir/12-4/colis/colis06.html
Hjørland, Birger. 2007b. "Semantics and Knowledge Organization". Annual review of information science and technology 41, 367-405.
Hjørland, Birger. 2008. "What is Knowledge Organization (KO)?" Knowledge Organization 35, nos 2/3: 86-101.
Hjørland, Birger. 2011a. "Evidence based practice. An analysis based on the philosophy of science". Journal of the American Society for Information Science and Technology 62, no. 7: 1301-1310.
Hjørland, Birger. 2011b. "The importance of theory of knowledge: indexing and information retrieval as an example". Journal of the American Society for Information Science and Technology 62, no 1: 72-77.
Hjørland, Birger. 2012a. "Is classification necessary after Google?" Journal of Documentation 68, no 3: 299-317.
Hjørland, Birger. 2012b. "Knowledge Organization = Information Organization?" Advances in Knowledge Organization 13: 8-14.
Hjørland, Birger. 2013a. "Citation analysis: A social and dynamic approach to knowledge organization". Information Processing and Management 49, no 6: 1313–1325.
Hjørland, Birger. 2013b. "Facet analysis: The logical approach to knowledge organization". Information Processing & Management 49: 545–557.
Hjørland, Birger. 2013c. "User-based and cognitive approaches to knowledge organization: A theoretical analysis of the research literature". Knowledge Organization 40: 11–27.
Hjørland, Birger. 2015a. "Classical databases and knowledge organization: A case for Boolean retrieval and human decision-making during searches". Journal of the Association for Information Science and Technology 66, no. 8: 1559–1575. DOI: 10.1002/asi.23250
Hjørland, Birger. 2015b. "Theories are Knowledge Organizing Systems (KOS)". Knowledge Organization 42, no 2: 113-128.
Hjørland, Birger. 2016a. "Does the Traditional Thesaurus Have a Place in Modern Information Retrieval?" Knowledge Organization 43, no. 3: 145-159.
Hjørland, Birger. 2016b. "Informetrics needs a foundation in the theory of science". In Cassidy Sugimoto (Ed.). Theories of Informetrics and Scholarly Communication (20-46). Berlin: Walter de Gruyter.
Hjørland, Birger. 2017. "Classification". Knowledge Organization 44, no. 2: 97-128. Also available at http://www.isko.org/cyclo/classification.
Hjørland, Birger & Kyllesbech Nielsen, Lykke. 2001. "Subject access points in electronic retrieval". Annual review of information science and technology 35: 249-298.
Hulme, E. Wyndham. 1911. "Principles of Book Classification". Library Association Record, 13:354-358, Oct. 1911; 389-394, Nov. 1911 & 444-449, Dec. 1911.
Jensen, Povl Johannes. 1973. Catalogue and scholarship: D. G. Moldenhawers catalogue in the Royal Library of Copenhagen. Copenhagen: The Royal Library.
Kedrow, Bonifatij M. 1975-1976. Klassifizierung der Wissenschaften. 2 Bde. Köln: Pahl-Rugenstein.
Keyser, Pierre de. 2012. Indexing: from thesauri to the semantic Web. Oxford, UK: Chandos.
Kruk, Miroslav. 1999. "The Internet and the revival of the myth of the universal library". The Australian Library Journal 48, no. 2: 137-147.
Leydesdorff, Loet. 2006. "Can scientific journals be classified in terms of aggregated journal-journal citation relations using the Journal Citation Reports?" Journal of the American Society for Information Science and Technology 57, no 5: 601-613.
Mai, Jens-Erik. 2011. "The modernity of classification". Journal of Documentation 67, no 4: 710-730.
Martínez-Ávila, Daniel. 2016. "BISAC: Book Industry Standards and Communications". Knowledge Organization 43, no. 8: 655-662. Also available at http://www.isko.org/cyclo/bisac.
Mills, Jack. 2004. "Faceted classification and logical division in information retrieval". Library trends 52, no 3: 541-570.
Neilson, Dixie. 2010. "Museum Registration and Documentation". Encyclopedia of Library and Information Sciences, Third Edition. Eds. Marcia J. Bates & Mary Niles Maack (vol. V, pp. 3739–3753). Boca Raton, Florida: CRC Press.
OECD. 2015. Frascati Manual 2015: Guidelines for Collecting and Reporting Data on Research and Experimental Development, The Measurement of Scientific, Technological and Innovation Activities. OECD Publishing, Paris. DOI: http://dx.doi.org/10.1787/9789264239012-en.
Oleson, Alexandra & Voss, John (Eds.). 1979. The Organization of knowledge in modern America, 1860-1920. Baltimore: Johns Hopkins University Press.
Ørom, Anders. 2003. "Knowledge Organization in the domain of Art Studies. History, Transition and Conceptual Changes". Knowledge Organization 30, nos 3-4: 128-143.
Pudovkin, Alexander I. & Garfield, Eugene. 2002. "Algorithmic procedure for finding semantically related journals". Journal of the American Society for Information Science and Technology 53, no 13: 1113-1119.
Ranganathan, Shiyali Ramamrita. 1951. Philosophy of library classification. Copenhagen: E. Munksgaard.
Robertson, Stephen. 2008. "The State of Information Retrieval." ISKO-UK. Presentation and audio recording freely available at http://www.iskouk.org/content/state-information-retrieval-researchers-view
Rosenfeld, Louis and Morville, Peter. 1998. Information architecture for the World Wide Web. 1st ed. Cambridge Sebastopol, CA: O'Reilly.
Samurin, Evgeniĭ Ivanovich. 1964. Geschichte der bibliothekarisch-bibliographischen Klassifikation. Band I-II. Leipzig: VEB Bibliographisches Institut.
Soergel, Dagobert. 1999. "The rise of ontologies or the reinvention of classification". Journal of the American Society for Information Science 50, no 12: 1119–1120.
Svenonius, Elaine. 1986. "Unanswered Questions in the Design of Controlled Vocabularies". Journal of the American Society for Information Science 37, no 5: 331-340.
Sweeney, Shelley. 2010. "Provenance of Archival Materials". In: Encyclopedia of Library and Information Sciences, 3. Edition, Eds. Marcia J. Bates & Mary Niles Maack (Vol. VI, pp. 4315-4323). Boca Raton, Florida: CRC Press.
Szostak, Rick; Gnoli, Claudio & López-Huertas, María. 2016. Interdisciplinary knowledge organization. Cham: Springer.
Voss, Jakob. 2013. Describing data patterns: A general deconstruction of metadata standards. Berlin: Humboldt University (Dissertation). http://edoc.hu-berlin.de/
Wallerstein, Immanuel et al. 1996. Open the Social Sciences: Report of the Gulbenkian Commission on the Restructuring of the Social Sciences. Stanford, CA: Stanford University Press.
Whitley, Richard R. 1984. The Intellectual and Social Organization of the Sciences. Oxford: Oxford University Press. (2nd ed. with a new introduction 2000).
Wikipedia, the free encyclopedia. Provenance. Retrieved 2016-07-05 from http://en.wikipedia.org/wiki/Provenance
This article (version 1.0) is published in Knowledge Organization, vol. 43 (2016), Issue 6, pp. 475-484.
How to cite it (version 1.0): Hjørland, Birger. 2016. Knowledge organization. Knowledge Organization 43, no. 6: 475-84. Also available in Hjørland, Birger, ed. ISKO Encyclopedia of Knowledge Organization, http://www.isko.org/cyclo/knowledge_organization .
©2016 ISKO. All rights reserved.