# Video BeDeHntF68M

**Source:** [https://www.youtube.com/watch?v=BeDeHntF68M](https://www.youtube.com/watch?v=BeDeHntF68M)

**Duration:** 00:08:17

## Sections

- [00:00:00](https://www.youtube.com/watch?v=BeDeHntF68M&t=0s) **Untitled Section**

## Full Transcript

i recently bought a new shirt outside of
this darkened room i do occasionally
dress in
something other than a black tee
and that purchase was a disaster the
colors were nothing like the picture and
the fit
it was not how it was described
so
i returned it
along with a strongly worded review and
my review was one of thousands it would
take the shirt seller hours to read them
all and this is just one of many many
items of clothing they sell
fortunately there's a better way to
process vast amounts of text like
product reviews and that is through
something called
text
mining
text mining is the practice of analyzing
vast amounts of textual materials to
capture key concepts trends and hidden
relationships it's the process of
transforming unstructured text into a
structured format to identify meaningful
patterns and new insights
now unstructured and structured text
what is that
well if we break text down
there's structured
and
structured text or structured data is
standardized into a tabular format
with rows and with columns
so this makes it very easy to process
think of like a database table or a
spreadsheet it's easy to query it's easy
to filter and to analyze
now unstructured
data
well that doesn't have a predefined
format and this includes all sorts of
text things like text documents
email messages
images videos social media posts that
sort of thing
now there's also
semi
structured
text and
that has some structure but not quite
enough to meet the requirements of a
relational database so think of like
xml or
json or something along those terms
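
To make the three kinds of data concrete, here is a minimal sketch using only the Python standard library. The review fields (`review_id`, `rating`, `size`, `tags`) and the sample values are made up for illustration; they are not from the video.

```python
import csv
import io
import json

# Structured: fixed columns, so every row can be filtered the same way.
structured = io.StringIO("review_id,rating,size\n1,2,M\n2,5,L\n")
rows = list(csv.DictReader(structured))
low_ratings = [row["review_id"] for row in rows if int(row["rating"]) <= 2]

# Semi-structured: JSON has keys, but records need not share the same fields.
semi_structured = json.loads(
    '[{"id": 1, "rating": 2, "tags": ["fit"]}, {"id": 2, "comment": "great"}]'
)
records_with_rating = [rec["id"] for rec in semi_structured if "rating" in rec]

# Unstructured: free text with no predefined fields to filter on.
unstructured = "the colors were nothing like the picture and the fit was not as described"

print(low_ratings)
print(records_with_rating)
```

The structured rows can be queried directly, the JSON records need a presence check per field, and the free text offers nothing to filter on until it has been processed.
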
now it turns out that something like 80 percent
of the data in the world resides in
an
unstructured format so there's plenty of
opportunity to put text mining to work
we use text mining to generate an index
of structured concepts to be able to
answer questions like which concepts
occur together and
what do the concepts predict
to do this we'll go through four
different stages
okay so stage one that is identify
this is where we identify the text that
is to be mined and that might be a
collection of news articles or product
reviews
in stage 2
we process
the text to remove noise and to
standardize the format so this includes
doing things like removing stop words
tokenizing the words lemmatizing
and part of speech
tagging all sorts of things like that are
used in the processing stage
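
A rough sketch of what that processing stage might look like, assuming a tiny hand-written stop word list and lemma table (`STOP_WORDS` and `LEMMAS` are hypothetical stand-ins; a real pipeline would typically rely on an NLP library for both). Part of speech tagging is left out because it needs a trained model.

```python
import re

# Hypothetical, deliberately tiny resources for illustration only.
STOP_WORDS = {"the", "and", "a", "it", "was", "were", "of", "to"}
LEMMAS = {"colors": "color", "described": "describe", "returned": "return"}


def preprocess(text):
    """Tokenize, drop stop words, then map each token to its lemma."""
    tokens = re.findall(r"[a-z']+", text.lower())       # tokenizing
    kept = [t for t in tokens if t not in STOP_WORDS]   # removing stop words
    return [LEMMAS.get(t, t) for t in kept]             # lemmatizing via lookup


print(preprocess("The colors were nothing like the picture"))
```
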
then stage three builds the concept
and the categories
and then in stage four we analyze all of
this
to really make predictions and to
discover relationships
now first of all let's focus here on
stage two for a moment
the primary problem with the management
of all this institutional text and data
is that there are no standard rules for
writing text so that a computer can
understand it but language and
consequently the meaning varies for
every document and every piece of text
so if we take a phrase let's say
reproduction
that pen's not so good let's try this
one reproduction
of documents
how can we expand the meaning of this
what other words would be synonyms for
reproduction
well a linguistics
based
text mining model
might suggest a couple of words for
reproduction like
copy
or it might suggest
duplication
and those look good
and that's because
linguistics-based text mining applies
the principles of natural language
processing or nlp to the analysis of
words phrases and syntax of text
an alternative to linguistics-based text
mining is statistics-based text mining
and that uses calculations of frequency
to derive related terms and
statistics-based text mining tells us
that reproduction is related to the term
birth
that's going to generate some highly
irrelevant results so using nlp
to understand
the language used cuts through the
ambiguity of text making linguistics-based
text mining the more reliable
approach
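
To see how frequency calculations alone can land on "birth", here is a small sketch of co-occurrence counting. The three-document corpus is invented for illustration, and `related_terms` is a hypothetical helper, not the method of any particular product.

```python
from collections import Counter


def related_terms(documents, target, top_n=3):
    """Rank words by how many documents they share with the target word."""
    counts = Counter()
    for doc in documents:
        words = set(doc.lower().split())
        if target in words:
            counts.update(words - {target})
    return counts.most_common(top_n)


# Invented corpus: most mentions of "reproduction" happen to be biological.
docs = [
    "reproduction birth rate",
    "reproduction birth cycle",
    "reproduction of documents",
]

print(related_terms(docs, "reproduction", top_n=1))
```

Because the counts know nothing about meaning, the most frequent neighbor wins even when the phrase being analyzed is about copying documents.
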
and it's this processing that brings us
to the category building of stage three
where the concepts and the types that
were extracted are used as the category
building blocks
with the built categories records and
documents are then assigned to those
categories
we can take a look at the text that they
contain and match an element of the
category's definition and from there the
relationship discovery and the
prediction analysis is performed here
by data
mining
and data mining is a topic that we've
addressed in another video so check that
out if you want to see some more detail
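
The category assignment described above can be sketched as a simple match between a document's words and each category's definition. The `CATEGORIES` table and its terms are assumptions made up for this example.

```python
# Hypothetical category definitions: category name -> terms that define it.
CATEGORIES = {
    "fit": {"fit", "size", "tight", "loose"},
    "color": {"color", "faded", "shade"},
}


def assign_categories(text, categories=CATEGORIES):
    """Return every category whose definition shares a word with the text."""
    words = set(text.lower().split())
    return sorted(name for name, terms in categories.items() if words & terms)


print(assign_categories("the fit was tight and the color faded"))
print(assign_categories("fast shipping"))
```

A document can land in several categories at once, or in none if nothing in its text matches a definition.
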
now beyond sifting through product
reviews where can text mining also be
applied
well in the wider field of customer
service
text mining can be applied to work with
sentiment analysis and that can provide
a mechanism for companies to prioritize
key pain points for their customers by
processing support tickets chatbot
responses and so forth
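
As a rough idea of how sentiment could help prioritize pain points, here is a word-list sketch. The `POSITIVE` and `NEGATIVE` lexicons, the ticket topics, and the ticket texts are all invented for illustration; real sentiment analysis usually relies on much richer models.

```python
from collections import Counter

# Hypothetical sentiment lexicons for illustration only.
NEGATIVE = {"disaster", "broken", "late", "wrong"}
POSITIVE = {"great", "love", "perfect"}


def sentiment(text):
    """Score a text as positive word count minus negative word count."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def pain_points(tickets):
    """Count topics among negatively scored tickets, most frequent first."""
    counts = Counter(topic for topic, text in tickets if sentiment(text) < 0)
    return counts.most_common()


tickets = [
    ("shipping", "package arrived late"),
    ("shipping", "wrong address and late"),
    ("fit", "love the fit"),
    ("color", "wrong color"),
]

print(pain_points(tickets))
```
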
there's also risk
management
and in risk management text mining can
provide insights around industry trends
of financial markets by monitoring
shifts in sentiment and by extracting
information from analyst reports and
white papers
and then in the field of maintenance
we can use text mining to derive
patterns that are correlated with
problems and that can be used to
generate preventative and reactive
maintenance procedures
oh and by the way that that poorly
fitted shirt that i sent back with the
scathing review well the seller sent me
a 50 percent
discount code in addition to my refund
another happy outcome of text mining at
work
thanks for watching and please consider
to like and subscribe to our channel and
also in the comments let us know about
any other tech topics you'd like us to
cover and we can continue to bring you
the content that is relevant to you like
some of these videos here