An ontology is a conceptual representation of a worldview, and this worldview is highly contextual. The scope of the world could be limited or expansive, and it could be a real world or an imaginary world (Liu & Özsu, 2009). Ontologies represent the worldview with a graphical structure (“Intro to Ontologies – OSF Wiki,” n.d.). From a philosophical point of view, ontology is “the study of the nature of the being, becoming, existence or reality, as well as the basic categories of being and their relations” (“Ontology”, 2017, para. 1). On a close look, ontologies closely resemble taxonomies and relational database schemas. All of them organize and represent domain knowledge in a hierarchical structure, but ontologies can also define concepts and build relations between them. The advantage of ontologies over taxonomies and relational databases is that they can define different kinds of relations in the knowledge representation structure (Gasevic, Djuric, & Devedzic, 2009).
Gruber (1995) defines ontology as “an explicit specification of a conceptualization” (p. 2). This means that representing a concept of the world requires some conceptualization. Here “conceptualization” refers to an abstract or simplified view of an idea. Such a conceptualization could cover explicit or implicit knowledge of that world, or any phenomenon, topic (or topics), or subject area of the world. The conceptualization of a given domain (area of interest) is built upon the key elements of that domain. These key elements are all the concepts, objects or entities belonging to that domain, and the relationships among those concepts. “Specification” means “a formal and declarative representation”, which leads to the machine-readability of an ontology (Gasevic, Djuric, & Devedzic, 2009). To achieve this goal, ontologies use computer-usable definitions to represent the concepts. The goal of an ontology is not only to represent domain knowledge, but also to represent it in a way that makes it reusable when needed (Innab, Yousef, & Al-Fayoumi, 2010). Ontologies can be described as a set of representational techniques in the field of computer and information sciences. According to Gruber (2008), ontologies use three major components to represent domain knowledge, which are “classes (or sets), attributes (or properties), and relationships (or relations among class members)” (para. 2).
The standard practice of representing domain knowledge through ontologies is to work with the terminology or “domain vocabulary” of that domain. An ontology takes all essential concepts and classifies them according to need. It represents those concepts and their classification through a taxonomic structure, together with their relations and domain axioms. Gasevic, Djuric, and Devedzic (2009) consider a topic in a given domain D described using a language L. The ontology provides a platform to list the types of things existing in D; the types include concepts, relations, and properties of D expressed using L.
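The three representational primitives named by Gruber can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the names `OntologyClass`, `Relationship` and `describe`, as well as the small library vocabulary (`Book`, `Author`, `writtenBy`), are hypothetical and do not come from any of the cited works.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OntologyClass:
    """A class (set) of things in the domain, with its attributes."""
    name: str
    attributes: List[str] = field(default_factory=list)


@dataclass(frozen=True)
class Relationship:
    """A named relation between members of two classes."""
    name: str
    source: str  # name of the class the relation starts from
    target: str  # name of the class the relation points to


# Hypothetical vocabulary for a small library domain (illustration only).
classes = {
    "Book": OntologyClass("Book", ["title", "year"]),
    "Author": OntologyClass("Author", ["name"]),
}
relationships = [Relationship("writtenBy", "Book", "Author")]


def describe(classes, relationships):
    """List the types of things in the domain as readable statements."""
    lines = []
    for cls in classes.values():
        lines.append(f"class {cls.name} has attributes {', '.join(cls.attributes)}")
    for rel in relationships:
        lines.append(f"{rel.source} --{rel.name}--> {rel.target}")
    return lines


statements = describe(classes, relationships)
```

Here the domain D is the toy library and the language L is simply Python data structures; a real ontology would use a dedicated ontology language instead.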
1.2 Why Ontologies?
In the age of computer applications and semantic web technologies, ontology is an essential tool for building and using various intelligent systems. It represents domain knowledge in a form that is easily understandable by both humans and machines. Ontologies play a crucial role in making interoperability among heterogeneous systems efficient and smooth. Some of the major reasons for creating ontologies are explained by Natalya F. Noy and Deborah L. McGuinness (n.d.), which are:
- “To share a common understanding of the structure of information among people or software agents.
- To enable reuse of domain knowledge.
- To make domain assumptions explicit.
- To separate domain knowledge from the operational knowledge.
- To analyze domain knowledge.” (para. 3)
There are a few domains, such as natural-language applications, databases and information retrieval, where ontologies play an important role. In the field of natural-language applications, ontologies are used to process natural language and extract knowledge from scientific texts (e.g. WordNet). Semantically rich information retrieval from a database can also be done efficiently by using ontologies.
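One way an ontology can support retrieval is by expanding a query term with its narrower concepts. The following is a minimal sketch of that idea, assuming a hypothetical sub-concept hierarchy (`narrower`) and a hypothetical three-document collection; the function names `expand` and `search` are illustrative and not taken from any particular system.

```python
# Hypothetical sub-concept hierarchy: each concept maps to its narrower concepts.
narrower = {
    "flower": ["rose", "lily"],
    "rose": ["tea rose"],
}

# Hypothetical document collection, keyed by document id.
documents = {
    1: "care guide for the tea rose",
    2: "lily bulbs in spring",
    3: "oak tree pruning",
}


def expand(term, narrower):
    """Return the term together with all of its narrower concepts."""
    expanded = [term]
    for child in narrower.get(term, []):
        expanded.extend(expand(child, narrower))
    return expanded


def search(term, documents, narrower):
    """Return ids of documents that mention the term or any narrower concept."""
    terms = expand(term, narrower)
    return sorted(
        doc_id
        for doc_id, text in documents.items()
        if any(t in text for t in terms)
    )
```

With this data, `search("flower", documents, narrower)` finds documents 1 and 2 even though neither contains the word "flower", whereas the same search with an empty hierarchy finds nothing.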
1.3 Historical Background of Ontology:
Ontology is one of the most powerful concepts in modern semantic technology, but the concept, as well as the term, has been used for several decades in different communities with widely diverse understanding. The term originates from the Greek words “ontos” and “logos”, which mean “being” and “word” respectively. The term “Ontologia” was first used in the field of philosophy. Two philosophers, Rudolf Goclenius in “Lexicon Philosophicum” (1613) and Jacob Lorhard in “Theatrum Philosophicum” (1613), coined the term “Ontologia” independently. The English term “ontology” first appeared in Bailey’s Dictionary in 1721, where it is defined as “an account of being in the abstract” (Smith, 2015). It is often used as a synonym of “Metaphysics” by philosophers. The term “Metaphysics” was used by Aristotle’s student. Aristotle referred to this idea (Metaphysics/Ontology) as the “first philosophy” (Pisanelli, De Lazzari, Innocenti, & Zanetti, 2009). So the concept of “Ontology” has existed for a long time, but only in recent years has it gained popularity in the field of computing, where it has come to mean a machine-readable vocabulary. Ontology has been used to represent an area or domain of knowledge in the field of computer and information science (Innab, Yousef, & Al-Fayoumi, 2010).
The concept of ontology came to Artificial Intelligence (AI) in the 1980s. A group of researchers recognized the power of ontologies from a different point of view, that of mathematical logic, and adopted the term “Ontology” (McCarthy, 1986). They argued that automated reasoning is possible through creating new ontologies as computational models (Hayes, 1985). According to them, “computational ontology” is a kind of “applied philosophy”. For the AI community, ontology refers to two aspects: the first is “a theory of a modeled world” and the second is “a component of knowledge systems” (Sowa, 1984).
In 1995, Guarino and Giaretta gave a set of possible interpretations of the term “Ontology”. They interpreted the term or concept of ontology from seven different points of view, which are:
i. “Ontology as a philosophical discipline
ii. Ontology as an informal conceptual system
iii. Ontology as a formal semantic account
iv. Ontology as a specification of a ‘Conceptualization’
v. Ontology as a representation of a conceptual system via a logical theory
a. characterized by specific formal properties
b. characterized only by its specific purposes
vi. Ontology as the vocabulary used by a logical theory
vii. Ontology as a (meta-level) specification of a logical theory”
In the same year, in another paper, Guarino explained “formal ontology”. He described ontology as the study and representation of our knowledge about the “nature of world” or about any “organization” (Guarino, 1995). In 1998, Guarino described “formal ontology” as a key player in the field of Information Systems and clearly explained how the concept of ontology differs between the philosophical sense and the computer science domain. According to him, in the philosophical sense, ontology is “a particular system of categories accounting for a certain vision of the world”. On the other hand, in the domain of computer science, especially in AI, “an ontology refers to an engineering artifact, constituted by a specific vocabulary used to describe a certain reality” (Guarino, 1998, p. 4).
1.4 Definition of Ontology:
According to Tom Gruber (2009): “In the context of computer and information sciences, an ontology defines a set of representational primitives with which to model a domain of knowledge or discourse. The representational primitives are typically classes (or sets), attributes (or properties), and relationships (or relations among class members)” (p. 1).
The Open Semantic Framework (“Intro to Ontologies”, 2014) defines ontologies as follows: “Ontologies are the structural frameworks for organizing information on the semantic Web and within semantic enterprises. They provide unique benefits in discovery, flexible access, and information integration due to their inherent connectedness; that is, their ability to represent conceptual relationships. Ontologies can be layered on top of existing information assets, which means they are an enhancement and not displacement for prior investments. And ontologies may be developed and matured incrementally, which means their adoption may be cost-effective as benefits become evident” (para. 1).
According to SemanticWeb.org (“Ontology.html”, 2012), “Ontologies are considered one of the pillars of the Semantic Web, although they do not have a universally accepted definition. A (Semantic Web) vocabulary can be considered as a special form of (usually light-weight) ontology, or sometimes also merely as a collection of URIs with an (usually informally) described meaning” (para. 1).
1.5 Components of Ontology:
Ontologies are made up of different components, and these may vary across perspectives and domains. The terminology used to name the components varies with the philosophical point of view and the language used to build the ontology (Lord, 2010). Nevertheless, a common view is that an ontology has four major components: concepts, relations, instances and axioms (Stevens, 2001). Each of these is described below.
A. Concepts: Concepts are the classes that capture the core entities of a domain. With the knowledge of these concepts, a domain can be explained or understood. According to Stevens (2001), concepts can be divided into two kinds. The first kind is the “Primitive Concept”, which refers to concepts that are condition dependent, meaning that they exist in a domain under some conditions. The second kind is the “Defined Concept”, which denotes concepts that are absolutely necessary to explain a domain. If these concepts are missing from a particular domain, that domain knowledge cannot be explained efficiently.
B. Relations: Relations describe the “interaction between concepts or a concept’s property”. According to Stevens (2001), this component falls into two major groups. The first group is the “Taxonomical Relation”. With the help of taxonomical relations, one can arrange all concepts of a domain into sub-concepts and super-concepts through tree structures. The second group is the “Associative Relation”. This is the most distinctive component of an ontology, as it relates concepts across tree structures. This component gives ontologies an advantage over taxonomies and relational databases in defining semantics.
C. Instances: Instances are the examples or the representations of a given concept. For example, if there is a concept called “flower” then “rose” or “lily” could be instances of that concept.
D. Axioms: In an ontology, there have to be some constraint values for either a given class or a given instance. These constraint values are denoted by axioms.
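The four components can be put together in a small Python sketch. This is a toy illustration under stated assumptions, not an implementation of any ontology language: the `Ontology` class, its method names, and the plant/insect vocabulary are all hypothetical, and the axiom is expressed as an ordinary Python function.

```python
class Ontology:
    """A toy container for concepts, relations, instances, and axioms."""

    def __init__(self):
        self.parent = {}        # taxonomical relation: concept -> super-concept
        self.associations = []  # associative relations: (subject, name, object)
        self.instance_of = {}   # instance -> concept
        self.axioms = []        # constraint functions: ontology -> bool

    def add_concept(self, name, parent=None):
        self.parent[name] = parent

    def add_instance(self, name, concept):
        self.instance_of[name] = concept

    def relate(self, subject, name, obj):
        self.associations.append((subject, name, obj))

    def is_a(self, concept, ancestor):
        """Follow taxonomical links upward to test sub-/super-concept membership."""
        while concept is not None:
            if concept == ancestor:
                return True
            concept = self.parent.get(concept)
        return False

    def check_axioms(self):
        """Return True when every axiom holds for the current content."""
        return all(axiom(self) for axiom in self.axioms)


onto = Ontology()
onto.add_concept("plant")
onto.add_concept("flower", parent="plant")
onto.add_concept("insect")
onto.add_instance("rose", "flower")
onto.add_instance("bee", "insect")
onto.relate("insect", "pollinates", "flower")  # links two separate trees

# Hypothetical axiom: every instance must belong to a declared concept.
onto.axioms.append(
    lambda o: all(c in o.parent for c in o.instance_of.values())
)
```

The `pollinates` link is the associative relation here: it connects the insect tree to the plant tree, which a pure taxonomy of sub- and super-concepts could not express.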
1.6 Types of Ontologies:
There are different kinds of ontologies developed for various purposes. To categorize them, different classification approaches can be taken. A few well-known categorizations of ontologies have been given by Mizoguchi and colleagues (1995), van Heijst and colleagues (1997), Guarino (1998), and Lassila and McGuinness (2001).
According to Mizoguchi and colleagues (1995), there are four kinds of ontologies, as described below:
i. Content Ontologies: Content ontologies are the pioneers in the context of knowledge reusability. Their major goal is to facilitate reusability of knowledge across applications and across domains. This kind of ontology can be further subdivided into categories such as “task ontologies”, “domain ontologies” and “general or common ontologies”.
ii. Communication Ontologies: This kind of ontology is used heavily for knowledge sharing purposes. These ontologies are also referred to as “Tell & Ask Ontologies”.
iii. Indexing Ontologies: This kind of ontology is used for retrieval purposes.
iv. Meta-ontology: A meta-ontology is commonly referred to as a “knowledge representation ontology”. This kind of ontology is heavily used in the computer and information science domain to represent domain knowledge.
Figure 1.1 shows the categorization of ontologies by Mizoguchi and colleagues.
Figure 1.1: Categorization of ontologies by Mizoguchi and Colleagues
Figure Reference: (Gomez-Perez A, 2004)
Van Heijst and colleagues (1997) classified ontologies in a completely different way. According to them there exist two “orthogonal dimensions” to classify ontologies, which are:
i. The amount and type of structure of the conceptualization: In this case, the major focus of consideration is on the structure of things, that is, whether the ontology aims to conceptualize a few terms (e.g. lexicons), information (e.g. databases) or domain knowledge (e.g. knowledge modeling).
ii. The subject of conceptualization: In this case, there are mainly four kinds of ontologies: domain, generic, application and representational. Domain Ontologies are specific to the knowledge of one domain and reusable only in that domain. Generic Ontologies are not bound to a domain, hence they can be used and reused across domains. Application Ontologies are specific to an application; their scope is very narrow and they are non-reusable. Representational Ontologies formalize or represent knowledge in such a way that a computer system can understand those concepts. There are many applications of this kind of ontology in the area of artificial intelligence. Figure 1.2 shows this categorization by van Heijst and colleagues.
Figure 1.2: Categorization of ontologies by Heijst and Colleagues
Figure Reference: (Gomez-Perez A, 2004)
Guarino (1997) classified ontologies mostly into four classes on the basis of their level of dependence on a particular task or point of view. These different levels are:
i. Top-level Ontology: These ontologies are general-purpose. They do not depend on any particular problem or domain, and generally describe concepts like object, state, action, etc. Top-level ontologies are also referred to as “Generic Ontologies” (van Heijst, Schreiber, & Wielinga, 1997), “Abstract ontologies” (Borst & Borst, 1997), “Upper (level) ontologies” (Guarino, 1998) or “Foundation(al) Ontologies” (Schneider, 2003).
ii. Domain Ontology: Domain ontologies define general domain knowledge. They do not focus on a particular task or application but try to target a wide range of different tasks and applications relevant to a particular domain. The major goal of this kind of ontology is applicability within the scope of particular domain expertise (Marquardt, Morbach, Wiener, & Yang, 2010).
iii. Task Ontology: Task ontologies describe how to perform a particular task or, in other words, how to solve a specific kind of problem. Generally, they depend on methods which can solve a problem or complete a task, so a task ontology is usually also referred to as a “Method ontology”. Task ontologies are specific to a task but not to any particular domain, and they can be reused across domains.
iv. Application Ontology: As the name suggests, these ontologies are specific to an application. The scope of this kind of ontology is narrow. It is often mistaken for a knowledge base. To resolve this misconception, Guarino (1997) proposed the following definition: “An application ontology comprises only state-independent information (i.e., facts that are always true), whereas a knowledge base may also hold state-dependent information (i.e., facts and assertions related to a particular state of affairs)” (p. 139-170).
These four types of ontologies are interdependent. According to Borst & Borst (1997) and Guarino (1997), top-level ontologies can be reused in task ontologies and domain ontologies. In this way, the other two can share the same worldview as the top-level ontology. For task ontologies, it is very common to use the same terminology as the top-level ontology to solve a problem or task. Domain ontologies can also describe domain knowledge under the guidance of a top-level ontology. An application ontology may use concepts from all other kinds of ontologies. Figure 1.3 shows these connections.
Figure 1.3: Categorization and connection of ontologies by Guarino
Figure Reference: (Gomez-Perez A, 2004)
Mizoguchi (2003) has classified ontologies into two different types:
i. Light-weight Ontology: According to Wikipedia, it is a knowledge organization system “in which concepts are connected by rather general associations than strict formal connections” (“Lightweight_ontology”, 2017, para. 1). Furt and Trichet (2006) define it as “an ontology simply based on a hierarchy of concepts and a hierarchy of relations” (p.38). Associative networks and multilingual classifications could be considered examples of this kind of ontology.
ii. Heavy-weight Ontology: According to Furt and Trichet (2006) it is a kind of light-weight ontology but, “enriched with axioms used to fix the semantic interpretation of concepts and relations. Such an ontology can be a domain ontology, an ontology of representation, an ontology of PSM, etc.” (p.38).
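The difference between the two types can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the concept and relation names are hypothetical, and a single disjointness axiom is chosen here as one possible example of an axiom that fixes the interpretation of concepts; the cited authors do not prescribe this particular axiom.

```python
# Light-weight part: a hierarchy of concepts and a hierarchy of relations.
concept_parent = {"animal": None, "plant": None, "dog": "animal", "rose": "plant"}
relation_parent = {"relatedTo": None, "eats": "relatedTo"}

# Heavy-weight enrichment: axioms that constrain how concepts may be interpreted.
# Here a single, hypothetical disjointness axiom: nothing is both animal and plant.
disjoint_pairs = [("animal", "plant")]


def ancestors(name, parent):
    """Return the given name and everything above it in the hierarchy."""
    found = set()
    while name is not None:
        found.add(name)
        name = parent.get(name)
    return found


def violates_disjointness(types, parent, disjoint_pairs):
    """Check whether an individual typed with `types` breaks a disjointness axiom."""
    covered = set()
    for t in types:
        covered |= ancestors(t, parent)
    return any(a in covered and b in covered for a, b in disjoint_pairs)
```

The two dictionaries alone form a light-weight ontology; it is the added axiom that lets the model reject an individual typed as both `dog` and `rose`.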
1.7 Ontology Development Methodologies:
Ontology development is also referred to as ontology engineering or the ontology building process. In the last two decades, several methodologies have been developed for ontology engineering. Although there are different kinds of methodologies, the basic nature of all of them is more or less the same (Gasevic, Djuric, & Devedzic, 2009). These methodologies are based on some “established principles, processes and practices”. There are several surveys on ontology development methodologies. Some of the major contributions have been made by Jones et al. (1998), Corcho et al. (2003) and Staab and Studer (2009). Gasevic et al. (2009) summarise ontology development methodologies as follows:
- “most ontology development methodologies that have been proposed focus on building ontologies;
- some other methodologies also include methods for merging, reengineering, maintaining, and evolving ontologies;
- yet other methodologies build on general software development processes and practices and apply them to ontology development;
- there are also methodologies that exploit the idea of reusing existing ontological knowledge in building new ontologies;
- some of the more recently proposed methodologies are based on the idea of using publicly available community-based knowledge to simplify and speed-up development of ontologies.” (p. 66)
There is no ‘ONE’ best way to build an ontology, because ontologies are built to organize domain knowledge and the same domain knowledge can be modeled in several ways or for different purposes. Jones et al. (1998), in their study, tried to find the common stages of ontology building. They noted that there are mainly two stages in this process. The first stage produces an informal description of the ontology. The second stage is focused on its formal embodiment in an ontology language. The Open Semantic Framework (2014) noted “The existence of these two descriptions is an important characteristic of many ontologies, with the informal description often carrying through to the formal description” (para. 4).
Noy and McGuinness (2001) have given step-by-step guidelines for creating ontologies. They are as follows:
i. Determine the domain and scope of the ontology: Before developing any ontology, there should be a clear vision of its coverage. In addition, the purpose and goal of the ontology have to be defined beforehand. What kind of queries the ontology should answer and how its growth and development will be maintained need to be planned before starting the ontology building process.
ii. Consider reusing existing ontologies: Nowadays there are many ontology libraries available, where one can find ready-made ontologies on several topics which can be reused. So before starting the ontology development process, it is always a good idea to check whether someone has already created a similar ontology. It is then possible to refine and extend that ontology according to the chosen domain or task. In this case, a few issues have to be taken care of, such as language conversion, interoperability, and tool support.
iii. Enumerate important terms in the ontology: This step is about collecting all related terms and providing their definition and relations.
iv. Define the classes and the class hierarchy: This step is about building a hierarchical relationship among the classes. It can be done with several different approaches (Gasevic, Djuric, & Devedzic, 2009):
a. Top-down: In this process, the focus of identifying classes goes generic to specific.
b. Bottom-up: In this process, the focus of identifying classes goes specific to generic.
c. Middle-out: In this method, the ontology building process starts with some important middle-layer concepts, and then concepts from the top and bottom layers are added to the hierarchical structure.
d. Combination of these approaches: A combination of all or any of the three techniques mentioned above could also be a useful method for ontology building.
v. Define the properties of classes (slots): In this step, the internal structure of the classes has to be defined. To do so, it is necessary to describe all primary and secondary properties of a given class and to establish relations among the classes.
vi. Define the facets of the slots: A slot could take different kinds of values. Every facet (e.g. name, number) of those values has to be defined.
vii. Create instances: In the last step, individual instances of the classes are created and their slot values are filled in.
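The later steps of this guideline (classes and hierarchy, slots, facets, and instances) can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions: the names `Slot`, `OntClass` and `create_instance`, the two facets shown (value type and whether a value is required), and the library classes are all hypothetical, and real ontology editors offer a richer set of facets.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Slot:
    """A property of a class together with its facets."""
    name: str
    value_type: type        # facet: allowed value type
    required: bool = True   # facet: whether a value must be supplied


@dataclass
class OntClass:
    name: str
    parent: Optional["OntClass"] = None
    slots: Dict[str, Slot] = field(default_factory=dict)

    def all_slots(self):
        """Collect own slots plus those inherited from super-classes."""
        inherited = self.parent.all_slots() if self.parent else {}
        return {**inherited, **self.slots}


def create_instance(cls, **values):
    """Create an instance after checking each slot value against its facets."""
    slots = cls.all_slots()
    unknown = set(values) - set(slots)
    if unknown:
        raise ValueError(f"unknown slots: {sorted(unknown)}")
    for slot in slots.values():
        if slot.name not in values:
            if slot.required:
                raise ValueError(f"missing value for slot '{slot.name}'")
            continue
        if not isinstance(values[slot.name], slot.value_type):
            raise TypeError(f"slot '{slot.name}' expects {slot.value_type.__name__}")
    return {"class": cls.name, **values}


# Hypothetical classes for a library domain, defined top-down.
resource = OntClass("Resource", slots={"title": Slot("title", str)})
book = OntClass("Book", parent=resource,
                slots={"pages": Slot("pages", int, required=False)})

sample = create_instance(book, title="Example Title", pages=120)
```

In this sketch `Book` inherits the `title` slot from `Resource`, so creating a book without a title raises a `ValueError`, while the optional `pages` slot may be omitted.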
OpenSemanticFramework.org (2014) presents two common ontology development methodologies with two different diagrams. One of them was developed by Simperl and colleagues (2006) (Figure 1.4) and the other by Corcho and colleagues (2003) (Figure 1.5), which are as follows:
Figure 1.4: Ontology development methodology by Simperl and Colleagues
Diagram Source: http://wiki.opensemanticframework.org/index.php/Ontology_Development_Methodologies
Figure 1.5: Ontology development methodology by Corcho and Colleagues
Diagram Source: http://wiki.opensemanticframework.org/index.php/Ontology_Development_Methodologies
1.8 Ontology Development Tools:
Ontology building is a time-consuming affair. It is a challenge to implement an ontology in an ontology language without any supporting tool. Hence there are several ontology development tools. According to Michael K. Bergman (2010), there are as many as 185 ontology tools available on the web. Each of the tools serves different goals. A list of popular ontology tools is given in appendix A. For the purpose of this work we have used “Protégé”, which is the most popular, to describe and represent an “Integrated Ontology For Information Resource Description”. Hence a small overview of Protégé is given below:
1.8.1 Protégé: An Overview:
Wikipedia (2016) defines Protégé as a “free, open source ontology editor and a knowledge management system” (para. 1). It provides an environment to build domain knowledge with the help of ontologies. Protégé implements a rich set of knowledge-modeling structures and actions that support the creation, visualization, and manipulation of ontologies in various representation formats. Protégé can be customized to provide domain-friendly support for creating knowledge models and data entry. Further, the functionalities of Protégé can be extended using its plug-in architecture and a Java-based Application Programming Interface (API) for building knowledge-based tools and applications.
Protégé is compliant with the W3C standards. It provides a simple, customizable user interface with several facilities such as change tracking, revision history, and multiple upload/download formats, and it is optimized for collaboration. Protégé is available in both desktop and web versions. The current (2017) desktop version is Protégé 5.2.0. Protégé desktop versions are downloadable at “http://protege.stanford.edu/products.php” and run on platforms such as Windows, Linux and Mac. It is the leading ontology development tool, with more than 300,000 registered users. Features include:
- Language: Protégé supports knowledge base entry in any language or character set but the tool itself is offered in English.
- Database: Protégé supports any database that has a JDBC driver. In practice, this means most relational databases including Oracle, MySQL, Microsoft SQL Server, and Microsoft Access are supported by Protégé.
1.8.2 Why Protégé?
i. It is Open Source.
ii. It provides infrastructure to build taxonomy as well as attributes of classes.
iii. It is a very good tool to implement relationships among classes.
iv. It allows using instances which leads to building a strong knowledge base.
v. It provides several output options (e.g. OWL, RDF Schema, XML, HTML).
vi. Components are reusable.
vii. It supports connecting to databases (MySQL, PostgreSQL). Hence, one can easily extract data from a database.
viii. Different useful plug-ins (e.g. View Plug-ins: OntoGraf, Import Plug-ins: ProtegeLOV, Reasoner Plug-in: RacerProTG, etc.) are available.
ix. Supports data transport via XSL Sheet.
x. Easy to install.
xi. User-friendly documentation is available.
xii. Mail forum support is very strong.
xiii. Platform independent support.
xiv. Large and Active community.
xv. Continuously updating.
As mentioned in the methodology, an understanding of ontology is essential for this work. Hence, this chapter provides an overall idea of ontologies. It could also be treated as a part of the “study of the subject” of this work. The concept of ontology has been described along with its different aspects. The chapter started with a general understanding of ontology, followed by the reasons why one should use ontology. After that, a historical perspective and definitions of ontology were provided. An account of ontological components, types, methodologies, languages, and tools has been given to enrich the understanding of ontologies.
Original Reference Article/ Collected From:
Biswas, S. (2017). Integrated ontology for information resource description.