I'm currently on a multi-week mission to explore the Hugging Face (HF) code library for Transformer Architecture (TA) systems for natural language processing (NLP) and today I did a question-answer (QA) example. Whew! That's a lot of acronyms in an introductory sentence (IS)!
TA systems are extraordinarily complex, so implementing a TA system from scratch, or even using a low-level library like PyTorch or Keras, is only barely feasible. The HF library makes writing TA systems much simpler, with the downside that customizing a TA system built on HF can be very difficult.
My approach to learning a new technology is to 1.) get a documentation example working, 2.) refactor the example, 3.) repeat until the overall picture gels in my head.
My most recent example is extractive question-answering. I set up a raw-text source corpus of a few sentences from the Wikipedia article on Transformers. Then I created a BERT-based model using HF and used the model to answer the question, "How do transformers work?" The computed answer was "deep learning model that adopts the mechanism of attention, differentially weighing the significance of each part of the input data."
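As an aside, the HF library also has a higher-level pipeline API that wraps the tokenizer and model objects behind a single call. My demo below uses the lower-level AutoTokenizer / AutoModelForQuestionAnswering approach, but a rough sketch of the pipeline version (same SQuAD fine-tuned BERT model, with the corpus trimmed to one sentence for brevity) would look something like:

# qa_pipeline_sketch.py
# illustrative sketch only -- not the demo program below
from transformers import pipeline

# question-answering pipeline wraps tokenizer + model in one object
qa = pipeline("question-answering",
  model="bert-large-uncased-whole-word-masking-finetuned-squad")

context = "A transformer is a deep learning model that adopts " + \
  "the mechanism of attention, differentially weighing the " + \
  "significance of each part of the input data."

result = qa(question="How do transformers work?", context=context)
print(result["answer"])  # the answer span extracted from the context

The pipeline approach is convenient for a quick experiment, but the lower-level approach in the demo makes it clearer what is actually happening under the hood.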
The moral of the story is that there are no shortcuts when it comes to learning a complex new software library or framework. You have to take one step at a time. Today was one of those steps for me.
Here are three beach photos where there is no answer to the question, "Why?"
Demo code:
# qa_test.py
# Python 3.7.6 (Anaconda3-2020.02)
# PyTorch 1.9.0 CPU, HugFace 4.2.2, Windows 10
# extractive question-answering using Hugging Face

from transformers import AutoTokenizer, \
  AutoModelForQuestionAnswering
import torch as T

def main():
  print("\nBegin extractive question-answer using Hugging Face ")

  # source corpus: a few sentences from the Wikipedia article
  corpus = r"""A transformer is a deep learning model that adopts
the mechanism of attention, differentially weighing the
significance of each part of the input data. It is used
primarily in the field of natural language processing (NLP)
and in computer vision (CV). Like recurrent neural networks
(RNNs), transformers are designed to handle sequential input
data, such as natural language, for tasks such as translation
and text summarization. However, unlike RNNs, transformers do
not necessarily process the data in order. Rather, the
attention mechanism provides context for any position in the
input sequence."""

  # tokenizer and model fine-tuned on the SQuAD dataset
  toker = AutoTokenizer.from_pretrained \
    ("bert-large-uncased-whole-word-masking-finetuned-squad")
  model = AutoModelForQuestionAnswering.from_pretrained \
    ("bert-large-uncased-whole-word-masking-finetuned-squad")

  quest = "How do transformers work?"
  print("\nThe question: ")
  print(quest)

  # encode question and corpus together as one input sequence
  inpts = toker(quest, corpus, add_special_tokens=True,
    return_tensors="pt")
  inpt_ids = inpts["input_ids"].tolist()[0]

  # the model emits one start-score and one end-score per token
  oupts = model(**inpts)
  ans_start_scores = oupts.start_logits
  ans_end_scores = oupts.end_logits

  # most likely start and end of the answer span; the +1 makes
  # the Python slice include the end token
  ans_start = T.argmax(ans_start_scores)
  ans_end = T.argmax(ans_end_scores) + 1
  ans = toker.convert_tokens_to_string \
    (toker.convert_ids_to_tokens(inpt_ids[ans_start:ans_end]))

  print("\nThe answer: ")
  print(ans)
  print("\nEnd demo ")

if __name__ == "__main__":
  main()
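To recap how the answer span is computed: the model produces start_logits and end_logits, one value per input token. The argmax of the start logits is the most likely first token of the answer, the argmax of the end logits is the most likely last token, and the +1 makes the Python slice include that last token. The tokens in that range are converted back to a plain string, which gives the "deep learning model that adopts the mechanism of attention . . . " answer mentioned above.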