We investigate the apparent effectiveness of Radford et al.’s “Sentiment Neuron,” which they claim encapsulates sufficient knowledge to accurately predict sentiment in reviews. In our analysis of the Sentiment Neuron, we find that removing the neuron only marginally affects a classifier’s ability to detect and label sentiment, and may even improve performance. Moreover, the effectiveness of the Sentiment Neuron can be surpassed by simply using 100 random neurons as features for the same classifier. Using adversarial examples, we show that the generated representation containing the Sentiment Neuron (i.e., the final hidden cell state in an LSTM) is particularly sensitive to the end of a processed sequence. Accordingly, we find that caution needs to be applied when interpreting neuron-based feature representations, and that potential flaws should be addressed before such representations are relied on in real-world applications.
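As a rough illustration of the comparison described above, the sketch below builds the three feature sets involved: the full representation with one unit removed, a single unit on its own, and a random subset of 100 units. It is not the authors' code. The representation size (512), the unit index (7), and all function names are placeholders chosen for the example, and the classifier that would consume these features is left out.

```python
import random

N_UNITS = 512          # hypothetical representation size, not from the paper
SENTIMENT_UNIT = 7     # hypothetical index of the unit under study


def ablate_unit(features, unit):
    """Return a copy of every feature vector with one unit removed."""
    return [row[:unit] + row[unit + 1:] for row in features]


def select_units(features, units):
    """Keep only the listed units from every feature vector."""
    return [[row[i] for i in units] for row in features]


def sample_random_units(n_units, k, seed=0, exclude=()):
    """Pick k distinct unit indices at random, skipping any excluded ones."""
    excluded = set(exclude)
    candidates = [i for i in range(n_units) if i not in excluded]
    rng = random.Random(seed)
    return sorted(rng.sample(candidates, k))


if __name__ == "__main__":
    # Stand-in for real hidden states: one random vector per review.
    rng = random.Random(42)
    reps = [[rng.gauss(0.0, 1.0) for _ in range(N_UNITS)] for _ in range(4)]

    without_unit = ablate_unit(reps, SENTIMENT_UNIT)        # N_UNITS - 1 features
    only_unit = select_units(reps, [SENTIMENT_UNIT])        # 1 feature
    random_units = sample_random_units(N_UNITS, 100, exclude=[SENTIMENT_UNIT])
    random_100 = select_units(reps, random_units)           # 100 features

    print(len(without_unit[0]), len(only_unit[0]), len(random_100[0]))
```

Each feature set would then be passed to the same classifier and evaluated on the same held-out reviews, so that any difference in accuracy can be attributed to the choice of units rather than to the model.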
Science
On Tradeoffs Between Document Signature Methods for a Legal Due Diligence Corpus
While document signatures are a well-established tool in IR, they have primarily been investigated in the context of web documents. Legal due diligence documents, by their nature, are more similar to one another in structure and language than documents in standard web collections. Moreover, many due diligence systems strive to facilitate real-time interaction, so the time from document ingestion to availability should be minimal. Such constraints further limit the possible solution space when identifying near-duplicate documents. We present an examination of the tradeoffs that document signature methods face in the due diligence domain. In particular, we quantify the tradeoff between signature length, time to compute, number of hash collisions, and number of nearest neighbours for a 90,000-document due diligence corpus.
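To make the notion of a document signature concrete, here is a minimal sketch of one common family of methods, a SimHash-style fingerprint over word shingles. The abstract does not say which signature methods the paper evaluates, so the choice of SimHash, the shingle size, and every function name here are assumptions made for illustration only. The `bits` parameter corresponds to the signature length being traded off against compute time and collisions.

```python
import hashlib


def shingles(text, k=3):
    """Split text into overlapping k-word shingles (lowercased)."""
    words = text.lower().split()
    if not words:
        return []
    if len(words) < k:
        return [" ".join(words)]
    return [" ".join(words[i:i + k]) for i in range(len(words) - k + 1)]


def simhash(text, bits=64, k=3):
    """Compute a SimHash-style signature of `bits` bits (bits <= 256)."""
    counts = [0] * bits
    for sh in shingles(text, k):
        # SHA-256 yields 256 bits, which bounds the usable signature length.
        h = int.from_bytes(hashlib.sha256(sh.encode("utf-8")).digest(), "big")
        for b in range(bits):
            counts[b] += 1 if (h >> b) & 1 else -1
    signature = 0
    for b in range(bits):
        if counts[b] > 0:
            signature |= 1 << b
    return signature


def hamming(a, b):
    """Number of bit positions in which two signatures differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    doc_a = "This agreement may be terminated by either party upon thirty days notice"
    doc_b = "This agreement may be terminated by either party upon sixty days notice"
    for bits in (16, 64, 128):
        print(bits, hamming(simhash(doc_a, bits), simhash(doc_b, bits)))
```

Shorter signatures are cheaper to compute and store but place more unrelated documents at the same or a nearby value, which is the kind of tradeoff the abstract describes measuring.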
Science
From Bubbles to Lists: Designing Clustering for Due Diligence
In due diligence, lawyers are tasked with reviewing a large set of legal documents to identify documents, and portions thereof, that may be problematic for a merger or acquisition. To help users review more efficiently, we sought to determine how document-level clustering may help users of a due diligence system during their workflow.
Following an iterative design methodology, we conducted several user studies, spanning three distinct phases and 27 users, with different versions of a document-level clustering feature. We found that the interface should adapt to a user’s understanding of what “similar documents” means so that trust can be established in the feature. Furthermore, establishing trust facilitates the user’s ability to negotiate with the underlying algorithm. Finally, while the usage of this feature may be influenced by a user’s role, it remains primarily a project management tool.