Show simple item record

dc.contributor.advisor: Pu, Ken
dc.contributor.advisor: Davoudi, Kourosh
dc.contributor.author: Stoica, Andrei
dc.date.accessioned: 2022-10-17T19:40:42Z
dc.date.available: 2022-10-17T19:40:42Z
dc.date.issued: 2022-09-01
dc.identifier.uri: https://hdl.handle.net/10155/1547
dc.description.abstract: This work outlines a method for performing natural language tasks as part of a relational framework. It uses PostgreSQL as the relational database and relies on its extensibility to allow for word embedding without leaving the database. This system can be extended to incorporate several natural language processing (NLP) techniques, such as latent Dirichlet allocation (LDA), or modern models, such as BERT. The combination of NLP and relational operations allows for extracting data from and analyzing text in the same interface used for general data analysis. This combination allows for gathering richer information from existing sources and makes it all available from one standard interface. The declarative nature of SQL allows for more ad-hoc application of NLP techniques. Two case studies using the DBLP dataset demonstrate this integration's power. The case studies involve building an LDA model, augmenting the topic labels for greater descriptiveness, and applying preexisting models for semantic analysis. [en]
dc.description.sponsorship: University of Ontario Institute of Technology [en]
dc.language.iso: en [en]
dc.subject: Query language [en]
dc.subject: Database [en]
dc.subject: Natural language processing [en]
dc.subject: Embedding vectors [en]
dc.subject: Text processing [en]
dc.title: Unified processing of natural language and relational data [en]
dc.type: Thesis [en]
dc.degree.level: Master of Science (MSc) [en]
dc.degree.discipline: Computer Science [en]

