Other questions in this quiz

2. Given the string s='abcccba', which of the following statements is true?

  • All the other choices are correct.
  • 'ab(c)*ba' is a regular expression for s.
  • 'abcba' is a sub-string of s.
  • 'abxba' is a regular expression for s.
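
The statements in question 2 can be checked mechanically. The following is a minimal sketch, assuming that "is a regular expression for s" means the pattern matches the whole string, which is what Python's `re.fullmatch` tests. The variable names are illustrative and not part of the quiz.

```python
import re

s = "abcccba"

# re.fullmatch succeeds only if the pattern matches the entire string.
matches_star = re.fullmatch(r"ab(c)*ba", s) is not None
matches_literal = re.fullmatch(r"abxba", s) is not None

# The `in` operator tests whether the left operand is a contiguous sub-string.
is_substring = "abcba" in s

print(matches_star, matches_literal, is_substring)
```

Under a looser reading (the pattern only has to match somewhere in s), `re.search` would be used instead of `re.fullmatch`.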

3. Which of the following does not generally reduce the size of the term-document matrix?

  • Word embeddings.
  • Stemming.
  • Tokenization.
  • Inverted file (linked list) data structure.

4. What is the advantage of probabilistic models with respect to deterministic models for information extraction?

  • Probabilistic models are Bayesian, i.e. with high accuracy.
  • Deterministic models are computationally more intensive than probabilistic models.
  • Deterministic models must match expressions exactly, whereas probabilistic models are more flexible.
  • Deterministic models work only with characters, while probabilistic models can also handle words.

5. Which of the following is false in the context of information retrieval?

  • The IDF score ensures that we do not give much importance to terms that appear in too many documents.
  • Information retrieval is similar to classification.
  • The false positives (FP) are the non-retrieved documents that are relevant to a query.
  • A precision-recall curve shows how the performance of an information retrieval approach is affected as we retrieve more and more documents relevant to a query.
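
The quantities named in question 5 can be computed directly. This is a rough sketch using the usual textbook definitions; the function names, the toy document IDs, and the IDF variant log(N / df) are assumptions chosen for illustration (other IDF formulas with smoothing are also common).

```python
import math

def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query, given document ID collections."""
    retrieved, relevant = set(retrieved), set(relevant)
    tp = len(retrieved & relevant)   # retrieved and relevant
    fp = len(retrieved - relevant)   # retrieved but not relevant
    fn = len(relevant - retrieved)   # relevant but not retrieved
    precision = tp / (tp + fp) if retrieved else 0.0
    recall = tp / (tp + fn) if relevant else 0.0
    return precision, recall

def idf(term, documents):
    """Inverse document frequency, log(N / df); 0.0 if the term never occurs."""
    df = sum(1 for doc in documents if term in doc)
    return math.log(len(documents) / df) if df else 0.0

# Hypothetical toy data: three retrieved documents, four relevant ones.
p, r = precision_recall(["d1", "d2", "d3"], ["d1", "d3", "d4", "d5"])

# Each document is modelled as a set of terms.
docs = [{"the", "cat"}, {"the", "dog"}, {"the", "bird"}]
idf_the = idf("the", docs)   # term appears in every document
idf_cat = idf("cat", docs)   # term appears in one document
```

Computing `precision_recall` on the top 1, 2, 3, ... ranked results gives the points of a precision-recall curve.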
