Research Interests:
Text Mining, Data Mining, Information Extraction, Algorithmic Trading, Social Networks.
Courses
Education
Ph.D. Cornell University, 1993
Publications
1. Ph.D. Dissertation
"Probabilistic Revision of Logical Domain Theories” (1993)
Advisor: Professor Alberto Segre.
Publications that are associated with the dissertation:
1) Paper numbers 4.1, 5.3, 5.4, 8.5-8.13
2. Books
1. Ronen Feldman and Jim Sanger. "Text Mining Handbook: Advanced Approaches in Analyzing Unstructured Data,” Cambridge University Press, Dec 2006.
Among the top 10 best sellers in Machine Learning and the top 20 best sellers in Data Mining (based on Amazon data).
443 citations on Google Scholar.
The book is based on my research; Jim helped with editing the material.
3. Books Edited
---
4. Chapters in Peer-Reviewed Collections
1. M. Koppel, A. Segre and R. Feldman. "An Integrated Framework for Knowledge Representation and Theory Revision,” In Machine Learning and Knowledge Acquisition, pp. 95-114, G. Tecuci and Y. Kodratoff (Eds.), Academic Press, 1995.
2. R. Feldman, I. Dagan, and W. Kloesgen. "KDD Tools for Mining Associations in Textual Databases,” Springer Lecture Notes in Computer Science, volume 1079, 96-107, 1996.
3. A. Amir, R. Feldman, and R. Kashi. "A New and Versatile Algorithm for Association Generation,” Springer Lecture Notes in Computer Science, volume 1263, 221-231, 1997.
4. R. Feldman, W. Klösgen, Y. Ben-Yehuda, G. Kedar and V. Reznikov. "Pattern based browsing in document collections,” Springer Lecture Notes in Computer Science, volume 1263, 112-122, 1997.
5. R. Feldman and H. Hirsh. "Finding Associations in Collections of Text,” In Methods and Applications of Machine Learning, Data Mining and Knowledge Discovery, R.S. Michalski, I. Bratko, and M. Kubat (eds.), John Wiley and Sons, Ltd., 1997, 20 pages.
6. R. Feldman, W. Kloesgen and A. Zilberstein. "Document Explorer: Discovering Knowledge in Document Collections,” Springer Lecture Notes in Computer Science, volume 1325, 137-146, 1997.
7. Ronen Feldman, Yonatan Aumann, Amir Zilberstein, Yaron Ben-Yehuda: Trend Graphs: Visualizing the Evolution of Concept Relationships in Large Document Collections. Springer Lecture Notes in Computer Science, volume 1510, 38-46, 1997
8. David Landau, Ronen Feldman, Yonatan Aumann, Moshe Fresko, Yehuda Lindell, Orly Liphstat, Oren Zamir: TextVis: An Integrated Visual Environment for Text Mining. Springer Lecture Notes in Computer Science, volume 1510, 56-64, 1997
9. Ronen Feldman, Moshe Fresko, Yakkov Kinar, Yehuda Lindell, Orly Liphstat, Martin Rajman, Yonatan Schler, Oren Zamir: Text Mining at the Term Level. Springer Lecture Notes in Computer Science, volume 1510, 65-73, 1997.
10. Ronen Feldman, Yonatan Aumann, Moshe Fresko, Orly Lipshtat, Binyamin Rosenfeld, Yonatan Schler: Text Mining via Information Extraction. Springer Lecture Notes in Computer Science, volume 1704, 165-173
11. Yonatan Aumann, Ronen Feldman, Yaron Ben Yehuda, David Landau, Orly Lipshtat, Yonatan Schler: Circle Graphs: New Visualization Tools for Text-Mining. Springer Lecture Notes in Computer Science, volume 1704, 277-282.
12. Ronen Feldman, Yonatan Aumann, Michal Finkelstein-Landau, Eyal Hurvitz, Yizhar Regev, Ariel Yaroshevich: A Comparative Study of Information Extraction Strategies. Springer Lecture Notes in Computer Science, volume 2276, 349-359, 2002.
13. Ronen Feldman. "Document Explorer”, Part Four, Chapter 24 in the Handbook of Data Mining and Knowledge Discovery, Oxford University Press 2002, 629-636.
14. Ronen Feldman. "Text Mining”, Part Six, Chapter 38 in the Handbook of Data Mining and Knowledge Discovery, Oxford University Press 2002, 749-757.
15. Sundar Varadarajan, Kas Kasravi, Ronen Feldman: Text-Mining: Application Development Challenges. In Proceedings of the Twenty-second SGAI International Conference on Knowledge Based Systems and Applied Artificial Intelligence, December 2002, Applications and Innovations in Intelligent Systems X, Springer-Verlag, 8 pages.
16. Ronen Feldman. "Mining Text Data”, Chapter 21 in Handbook of Data Mining, Lawrence Erlbaum Associates, 2003, 48 pages.
17. Moty Ben-Dov, Ronen Feldman: Text Mining and Information Extraction. The Data Mining and Knowledge Discovery Handbook 2005: 801-831, Springer.
18. Ronen Feldman, Suresh Govindaraj, Sangsang Liu, Joshua Livnat: Optimal Portfolio Construction Using Qualitative and Quantitative Signals. In Communication and Language Analysis in the Corporate World, Roderick P. Hart (Ed.), 2013.
5. Articles in Peer-Reviewed Journals
Note: In parentheses, the impact factor and rank of journals in the relevant area (Computer Science) at year of publication and at present; number of ISI citations; number of Google Scholar references.
1. R. Feldman and M.C. Golumbic. "Optimization algorithms for scheduling via constraint Satisfiability,” The Computer Journal, pp. 356-364, Aug. 1990. (at year of publication N.A, 2010 impact factor 1.363, 41/92, 6 citations in ISI, 19 citations in Google Scholar)
The work is based on my Master's thesis and the research was mainly mine. Martin helped with the writing.
2. R. Feldman and M.C. Golumbic. "Interactive scheduling as a constraint satisfiability problem,” In Annals of Mathematics and Artificial Intelligence, pp. 49-73, Aug. 1990. (at year of publication N.A, 2010 impact factor 0.430, 0 citations in ISI, 4 citations in Google Scholar)
The work is based on my Master's thesis and the research was mainly mine. Martin helped with the writing.
3. M. Koppel, R. Feldman and A. Segre "Bias-Driven Revision of Logical Domain Theories,” Journal of Artificial Intelligence Research, pp. 159-208, 1994. (at year of publication N.A, 2010 impact factor 1.691, 0 citations in ISI, 48 citations in Google Scholar, top 0.81% in CS/IS publications)
The work is based on my PhD thesis and the research was mainly mine. It was based on collaboration with Moshe Koppel and my Advisor.
4. R. Feldman, M. Koppel and A. Segre "Extending the Role of Bias in Probabilistic Theory Revision,” Knowledge Acquisition Journal, Vol. 6, pp. 197-214,1994. (at year of publication 7.29, 2 citations in ISI, 4 citations in Google Scholar)
The work is based on my PhD thesis and the research was mainly mine. It was based on collaboration with Moshe Koppel and my Advisor.
5. Amihood Amir, Ronen Feldman, Reuven Kashi: A New and Versatile Method for Association Generation. IS 22(6/7): 333-347 (1997) . (at year of publication N.A, 2010 impact factor 1.592, 22 citations in ISI, 48 citations in Google Scholar, top 30.79% in CS/IS publications)
Joint research with the other 2 authors
6. Ronen Feldman, Haym Hirsh: Exploiting Background Information in Knowledge Discovery from Text. JIIS 9(1): 83-97 (1997) (at year of publication N.A, 2010 impact factor 0.875, 0 citations in ISI, 52 citations in Google Scholar, top 19.73% in CS/IS publications)
I did most of the research and Haym helped with the writing.
7. Ronen Feldman, Ido Dagan, Haym Hirsh: Mining Text Using Keyword Distributions. JIIS 10(3): 281-300 (1998) (at year of publication N.A, 2010 impact factor 0.875, 50 citations in ISI, 89 citations in Google Scholar, top 19.73% in CS/IS publications)
Joint research with the other 2 authors. I did all the implementations of the system.
8. Ronen Feldman, Willi Klösgen: Data Mining on the Web: A Promising Challenge? KI 12(1): 35-36 (1998)
Joint research with Willi while I visited GMD.
9. Yonatan Aumann, Ronen Feldman, Orly Liphstat, Heikki Mannila: Borders: An Efficient Algorithm for Association Generation in Dynamic Databases. JIIS 12(1): 61-73 (1999) (at year of publication N.A, 2010 impact factor 0.875, 18 citations in ISI, 24 citations in Google Scholar, top 19.73% in CS/IS publications)
The research was done while I visited Heikki in Helsinki. The other authors mostly helped with writing.
10. Ronen Feldman, Yizhar Regev, Michal Finkelstein-Landau, Eyal Hurvitz & Boris Kogan: Mining biomedical literature using information extraction. Current Drug Discovery, Volume 2, Issue 10, pages 19-23, October 2002.
Joint work done mostly by the first 3 authors.
11. Yizhar Regev, Michal Finkelstein-Landau, Ronen Feldman: Using Rule-based Information Extraction for Locating Experimental Evidence in the Biomedical Domain – the KDD Cup 2002. KDD Explorations, December 2002, 3 pages. (at year of publication N.A, 2007 impact factor 0.58, 0 citations in ISI, 0 citations in Google Scholar, top 45.20% in CS/IS publications)
Joint work done mostly by the first 3 authors.
12. Ronen Feldman, Joshua Livnat and Ron Lazar: Earnings Guidance after Regulation FD. The Journal of Investing, 2003, 33 pages. (at year of publication N.A, 2007 impact factor NA, 0 citations in ISI, 2 citations in SSRN)
Joint work done mostly by the first 2 authors.
13. Hagit Shatkay and Ronen Feldman: Mining the Biomedical Literature in the genomic era, a review. Journal of Computational Biology, 10 (6): 821-855 (2003). (at year of publication N.A, 2010 impact factor 1.600, 105 citations in ISI, 148 citations in Google Scholar, top 28.09% in CS/IS publications)
Joint Work that combined the work of the 2 authors
14. Ronen Feldman, Yizhar Regev, Michal Finkelstein-Landau, Eyal Hurvitz & Boris Kogan, "Mining the biomedical literature using semantic analysis”, Biosilico 1(2):69-80 (2003). (at year of publication N.A, 5 citations in ISI, 21 citations in Google Scholar)
Joint work done mostly by the first 3 authors.
15. Yonatan Aumann, Amihood Amir, Ronen Feldman, Moshe Fresko, "Maximal Association Rules: a Tool for Mining Associations in Text”, J. Intell. Inf. Syst. 25(3): 333-345 (2005). (at year of publication N.A, 2010 impact factor 0.875, 5 citations in ISI, 47 citations in Google Scholar, top 19.73% in CS/IS publications)
I developed the algorithm and Yonatan helped with the formulation and writing.
16. Ronen Feldman, Benjamin Rosenfeld, Moshe Fresko, "TEG - A Hybrid Approach to Information Extraction”, KAIS, 9(1): 1-18 (2006). (at year of publication 0.833, 2010 impact factor 2.008, 11 citations in ISI, 9 citations in Google Scholar, top 45.53% in CS/IS publications)
Joint work done mostly by the first 2 authors.
6. Since last promotion
17. Ronen Feldman, Benjamin Rosenfeld, Joshua Livnat, "Reasons for Late SEC Filings: Computerized Retrieval and Classification”, Journal of Intelligent Data Analysis, 10(2): 183 - 195 (2006). (at year of publication N.A, 2010 impact factor 0.412, 0 citations in ISI, 3 citations in Google Scholar)
Joint work done mostly by the first and the third authors.
18. Yonatan Aumann, Ronen Feldman, Benjamin Rosenfeld, Yair Liberzon, Jonathan Schler, "Visual Information Extraction”, KAIS, 10(1): 1-15. (at year of publication 0.833, 2010 impact factor 2.008, 4 citations in ISI, 11 citations in Google Scholar, top 45.53% in CS/IS publications)
Joint work done mostly by the first 3 authors.
19. Gregory Piatetsky-Shapiro, Robert Grossman, Chabane Djeraba, Ronen Feldman, Lise Getoor, Mohammed Zaki: "What are the grand challenges for data mining?”, KDD Explorations, Vol 8, Issue 2, 70-77, Dec 2006. (at year of publication N.A, 2007 impact factor 0.58, 0 citations in ISI, 20 citations in Google Scholar, top 45.20% in CS/IS publications)
Based on a KDD-2006 panel in which all authors participated.
20. Ronen Feldman, Yizhar Regev, Maya Gorodetsky: A modular information extraction system. Intell. Data Anal. 12(1): 51-71 (2008). (at year of publication 0.428, 2010 impact factor 0.412, 1 citations in ISI, 6 citations in Google Scholar)
Joint work done mostly by the first 2 authors.
21. Benjamin Rosenfeld, Ronen Feldman: Self-supervised relation extraction from the Web. Knowl. Inf. Syst. Journal 17(1): 17-33 (2008). (at year of publication 1.733, 2010 impact factor 2.008, 2 citations in ISI, 18 citations in Google Scholar, top 45.53% in CS/IS publications)
Joint research done by the authors. Benjamin implemented the system
22. Ronen Feldman; Joshua Livnat; Benjamin Segal : Shorting Companies That Restate Previously Issued Financial Statements. Journal of Investing, Vol. 17, No. 3: 2008, 6-15.
Joint work done mostly by the first 2 authors.
23. Ronen Feldman, Suresh Govindaraj, Joshua Livnat.: "Management's Tone Change, Post Earnings Announcement Drift and Accruals”. REVIEW OF ACCOUNTING STUDIES Volume: 15 Issue: 4 Pages: 915-953, Published: DEC 2010 (at year of publication 1.972, 3 citations in ISI, 25 citations in Google Scholar)
Joint work done mainly by the first and third authors.
24. Ronen Feldman, Joshua Livnat, Yuan Zhang: "Analysts' Earnings Forecast, Recommendation and Target Price Revisions”. The Journal of Portfolio Management Spring 2012, Vol. 38, No. 3: pp. 120-132. (at year of publication 0.9, 0 citations in ISI, 0 citations in Google Scholar)
Joint work done by all authors
25. Oded Netzer, Ronen Feldman, Moshe Fresko, Jacob Goldenberg: Mine Your Own Business: Market Structure Surveillance Through Text Mining. Marketing Science 31 (3), 521-543 (at year of publication 2.194, 2 citations in ISI, 10 citations in Google Scholar)
Joint work done mostly by the first, second and fourth authors.
26. Ronen Feldman: Techniques and applications for sentiment analysis. Commun. ACM 56(4): 82-89 (2013). (at year of publication 1.9, 1 citations in ISI, 7 citations in Google Scholar)
In Preparation
1. Ronen Feldman, Suresh Govindaraj, Joshua Livnat and Christine Petrovits : Do Managers "Put Their Money Where Their Mouth Is"? and Should Investors Follow Suit?
Joint work done by all authors.
2. Ronen Feldman, Suresh Govindaraj, Joshua Livnat: Is The Accruals Anomaly Dead? A Premature Death Certificate.
Joint work done by all authors.
7. Conference Proceedings
1. R. Feldman and M.C. Golumbic. "Interactive Scheduling as a Constraint Labeling Problem,” In Proceedings of the 4th Israeli Symposium on Artificial Intelligence, pp. 136-145, December 1987, Ramat-Gan, Israel.
Based on my M.Sc research
2. R. Feldman and M.C. Golumbic "Constraint Satisfiability algorithms for interactive student scheduling,” In Proceedings of IJCAI-89, pp. 1010-1016, Aug. 1989, Detroit, MI. (CiteSeer impact factor 1.82, 11 citations in Google Scholar, top 4.09% in CS/IS publications)
Based on my M.Sc research
3. D. Subramanian and R. Feldman "The Utility of EBL in Recursive Domains,” In Proceedings of AAAI-90, pp. 942-949, July 1990, Boston, MA. (CiteSeer impact factor 1.49, 32 citations in Google Scholar, top 9.17% in CS/IS publications)
Based on my Ph.D. research
4. R. Feldman and D. Subramanian. "Example Guided Optimization of Recursive Domain Theories,” In Proceedings of IEEE AI Applications Conference, pp. 240-244, Feb. 1991, Miami Beach, FL. (CiteSeer impact factor NA, 1 citation in Google Scholar)
Based on my Ph.D. research
5. R. Feldman, A. Segre and M. Koppel. "Incremental Refinement of Approximate Domain Theories,” In Proceedings of the 8th Intl. Machine Learning Conference, pp. 500-504, June 1991 Evanston, IL. (CiteSeer impact factor 2.12, 11 citations in Google Scholar, top 1.88% in CS/IS publications)
Based on my Ph.D. research
6. R. Feldman, A. Segre and M. Koppel. "Refinement of Approximate Rule Bases,” In Proceedings of the World Congress on Expert Systems, pp. 615-622, Dec. 1991, Orlando, FL.
Based on my Ph.D. research
7. R. Feldman, M. Koppel and A. Segre "Probabilistic Revision of Logical Domain Theories,” In Working Notes of AAAI Spring Symposium on Knowledge Assimilation, pp. 51-61, March 1992, Stanford, CA.
Based on my Ph.D. research
8. R. Feldman, M. Koppel and A. Segre "Probabilistic Revision of Propositional Domain Theories,” In Proceedings of the 9th Israeli Symposium on Artificial Intelligence, pp. 132-146, December 1992, Ramat-Gan, Israel.
Based on my Ph.D. research
9. R. Feldman, M. Koppel and A. Segre "The Relevance of Bias in the Revision of Approximate Domain Theories,” In Proceedings of IJCAI-93 workshop on Knowledge Acquisition and Machine Learning, pp. 44-60, August 1993, Chambery, France. (2 citations in Google Scholar)
Based on my Ph.D. research
10. M. Koppel, R. Feldman and A. Segre "Theory Revision Using Noisy Exemplars,” In Proceedings of the 10th Israeli Symposium on Artificial Intelligence, pp. 96-107, December 1993, Ramat-Gan, Israel.
Based on my Ph.D. research
11. R. Feldman and C. Nedellec "A Framework for Specifying Explicit Bias for Revision of approximate Knowledge Bases,” In Proceedings of the 7th International Conference on Knowledge Acquisition, chapter 15, pp. 1-20, Banff, Canada, Feb 1994.
Research done while I was visiting the University of Paris
12. M. Koppel, A. Segre and R. Feldman. "Getting the Most from a Flawed Theory,” 9th International Machine Learning Conference, pp. 139-147, Rutgers, NJ, June 1994. (CiteSeer impact factor 2.12, 4 citations in Google Scholar, top 1.88% in CS/IS publications)
Based on my Ph.D. research
13. R. Feldman. "FRST - An Interactive Revision System for Forward Chaining Rule Bases,” In Proceedings of ECAI workshop on integration of Knowledge Acquisition and Machine Learning, Amsterdam, Holland, Aug 1994.
14. R. Feldman and I. Dagan. "Knowledge Discovery in Texts,” In Proceedings of the ECML-95 Workshop on Knowledge Discovery, pp. 175-180, Crete, Greece, May 1995.
Joint work, I did all system design and implementation
15. R. Feldman and I. Dagan. "Knowledge Discovery in Textual Databases (KDT),” In Proceedings of the 1st International Conference on Knowledge Discovery (KDD-95), pp. 112-117, Montreal, Aug 1995. (CiteSeer impact factor 1.68, 205 citations in Google Scholar, top 6.14% in CS/IS publications)
Joint work, I did all system design and implementation
16. S. Engelson, R. Feldman, M. Koppel, A. Nerode, J. Remmel "FROST - A Forward Chaining Rule Ordering System for Reasoning with Nonmonotonic Rule Systems,” In Proceedings of the IJCAI-95 workshop on Implementation of Nonmonotonic Systems, pp. 27-36, Montreal, Aug 1995.
Mostly my work
17. R. Feldman, I. Dagan, and W. Kloesgen. "Efficient Algorithms for Mining and Manipulating Associations in Texts,” In Proceedings of EMCSR96, pp. 949-954, Vienna, Austria, April 1996.
Joint work, I did all system design and implementation
18. I. Dagan, R. Feldman, and H. Hirsh. "Keyword-Based Browsing and Analysis of Large Document Sets,” In Proceedings of SDAIR96, pp. 191-208, Las Vegas, Nevada, April 1996.
Joint work, I did all system design and implementation
19. R. Feldman, "The KDT System - Using Prolog for KDD,” In Proceedings of the 4th Conference on Practical Applications of Prolog, pp. 91-110, London, April 1996.
20. R. Feldman and H. Hirsh. "Mining Associations in Text in the Presence of Background Knowledge,” In Proceedings of the 2nd International Conference on Knowledge Discovery (KDD-96), pp. 343-346, Portland, Aug 1996. (CiteSeer impact factor 2.12, 94 citations in Google Scholar, top 1.88% in CS/IS publications)
Joint work, I did all system design and implementation
21. R. Feldman, A. Amir, Y. Aumann, A. Zilberstein, and H. Hirsh. "Incremental Algorithms for Association Generation,” In Proceedings of the 1st Pacific Asia Conference on Knowledge Discovery and Data Mining (PAKDD97), 14 pages, Singapore, 1997.
Joint work, I did all system design and implementation
22. R. Feldman, Y. Aumann, A. Amir and H. Mannila. "Efficient Algorithms for Discovering Frequent Sets in Incremental Databases”, In SIGMOD'97 Workshop on Research Issues in Data Mining and Knowledge Discovery (DMKD'97), 12 pages, AZ, USA, 1997.
Joint work, I did all system design and implementation
23. R. Feldman, Y. Aumann, A. Amir, W. Kloesgen and A. Zilberstein "Maximal Association Rules: a New Tool for Mining for Keyword co-occurrences in Document Collections”, In Proceedings of the 3rd International Conference on Knowledge Discovery (KDD-97), 167-170, Newport Beach, CA, Aug 1997. (CiteSeer impact factor 1.68, 47 citations in Google Scholar, top 6.14% in CS/IS publications)
Joint work, I did all system design and implementation
24. R. Feldman, Y. Aumann, A. Amir, W. Kloesgen and A. Zilberstein "Visualization Techniques to Explore Data Mining Results for Document Collections”, In Proceedings of the 3rd International Conference on Knowledge Discovery (KDD-97), 16-23, Newport Beach, CA, Aug 1997. (CiteSeer impact factor 1.68, 37 citations in Google Scholar, top 6.14% in CS/IS publications)
Joint work, I did all system design and implementation
25. Ronen Feldman: Mining Unstructured Data. KDD Tutorial Notes 1999: 182-236
26. Ronen Feldman, Yair Liberzon, Binyamin Rosenfeld, Jonathan Schler, Jonathan Stoppi: A framework for specifying explicit bias for revision of approximate information extraction rules. KDD 2000: 189-197. (CiteSeer impact factor 1.68, 15 citations in Google Scholar, top 6.14% in CS/IS publications)
Joint work, I did all system design and implementation, and extension of my PhD work.
27. Ronen Feldman, Yonatan Aumann, Yair Liberzon, Kfir Ankori, Jonathan Schler, Benjamin Rosenfeld: A Domain Independent Environment for Creating Information Extraction Modules. CIKM 2001: 586-588. (CiteSeer impact factor 0.73, 9 citations in Google Scholar, top 35.87% in CS/IS publications)
Joint work of mainly the first, second and last authors.
28. Benjamin Rosenfeld, Ronen Feldman, Yonatan Aumann: Structural Extraction from Visual Layout of Documents. CIKM 2002, 203-210. (CiteSeer impact factor 0.73, 10 citations in Google Scholar, top 35.87% in CS/IS publications)
Mainly joint work of the first two authors.
29. Benjamin Rosenfeld, Ronen Feldman, Moshe Fresko, Jonathan Schler, Yonatan Aumann: TEG - A Hybrid Approach to Information Extraction, CIKM 2004, 589-596. (CiteSeer impact factor 0.73, 9 citations in Google Scholar, top 35.87% in CS/IS publications)
Joint work of mainly the first two authors.
30. Ronen Feldman, Benjamin Rosenfeld, Moshe Fresko, Brian Davison, "Hybrid Semantic Tagging for Information Extraction”, WWW'05, Japan, 1022-1023.
Joint work of mainly the first two authors.
31. Benjamin Rosenfeld, Moshe Fresko, Ronen Feldman, A Systematic Comparison of Feature-Rich Probabilistic Classifiers for NER Tasks, PKDD 2005, Lecture Notes in Computer Science, Volume 3721, Nov 2005, Pages 217 – 227. (CiteSeer impact factor 0.50, 2 citations in Google Scholar, top 51.26% in CS/IS publications)
Joint work – based on the PhD thesis of the 2nd author
32. Moshe Fresko, Binyamin Rosenfeld, Ronen Feldman. "A Hybrid Approach to NER by MEMM and Manual Rules,” CIKM, 2005, Bremen, Germany, 361-362. (CiteSeer impact factor 0.73, 9 citations in Google Scholar, top 35.87% in CS/IS publications)
Joint work – based on the PhD thesis of the 1st author
33. Moshe Fresko, Binyamin Rozenfeld, and Ronen Feldman. "A Hybrid Approach to NER by Integrating Manual Rules into MEMM,” AI and Math 2006, Ft. Lauderdale, Florida, 7 pages.
Joint work – based on the PhD thesis of the 1st author
34. Binyamin Rosenfeld, Ronen Feldman, Moshe Fresko "A Systematic Cross-Comparison of Sequence Classifiers,” SDM 2006, Maryland, USA, Pages 563-567. (4 citations in Google Scholar)
Joint work of mainly the first two authors.
35. Binyamin Rosenfeld and Ronen Feldman. URES: an Unsupervised Web Relation Extraction System. In Proceedings of the 44th Meeting of the Association for Computational Linguistics (ACL). July 2006. Sydney, Australia, 667-674. (CiteSeer impact factor 1.44, 3 citations in Google Scholar, top 10.15% in CS/IS publications)
Joint Work
36. Ronen Feldman and Binyamin Rosenfeld. "Boosting Unsupervised Relation Extraction by Using NER,” in Proceedings of EMNLP-06, 11th Conference on Empirical Methods in Natural Language Processing. July 2006. Sydney, Australia, 473-481.
Joint Work
37. Gregory Piatetsky-Shapiro, Robert Grossman, Chabane Djeraba, Ronen Feldman, Lise Getoor, Mohammed Zaki: "Is there a grand challenge or X-prize for data mining?” KDD 2006, August 2006, Pages 954-956.
Based on panel in KDD-2006 by all authors.
38. Ronen Feldman and Binyamin Rosenfeld. "Self-Supervised Relation Extraction from the Web,” in Proceedings of ISMIS-2006, Sept 2006. Bari, Italy, 755-764. (CiteSeer impact factor 0.33, 6 citations in Google Scholar, top 62.16% in CS/IS publications)
Joint Work
Since Last Promotion
39. Benjamin Rosenfeld, Ronen Feldman, "High-Performance Unsupervised Relation Extraction from Large Corpora,” in Proceedings of ICDM-06, IEEE International Conference on Data Mining, Hong Kong, Dec 2006, 1032-1037. (CiteSeer impact factor 0.35, 9 citations in Google Scholar, top 59.86% in CS/IS publications)
Joint Work
40. Shaul Ben Michael, Ronen Feldman: Visual Query and Exploration System for Temporal Relational Database. Industrial Conference on Data Mining 2007: 283-295
Based on the M.Sc thesis of the 1st author
41. Ronen Feldman, Moshe Fresko, Jacob Goldenberg, Oded Netzer, Lyle H. Ungar: Extracting Product Comparisons from Discussion Boards. ICDM 2007: 469-474. (CiteSeer impact factor 0.35, 17 citations in Google Scholar, top 59.86% in CS/IS publications)
Joint work of the first 4 authors
42. Benjamin Rosenfeld, Ronen Feldman: Clustering for unsupervised relation identification. CIKM 2007: 411-418. (CiteSeer impact factor 0.73, 30 citations in Google Scholar, top 35.87% in CS/IS publications)
Joint Work
43. Benjamin Rosenfeld, Ronen Feldman: Using Corpus Statistics on Entities to Improve Semi-supervised Relation Extraction from the Web. ACL 2007. (CiteSeer impact factor 1.44, 22 citations in Google Scholar, top 10.15% in CS/IS publications)
Joint Work
44. Binyamin Rosenfeld, Ronen Feldman, Lyle H. Ungar: Using sequence classification for filtering web pages. CIKM 2008: 1355-1356. (CiteSeer impact factor 0.73, 9 citations in Google Scholar, top 35.87% in CS/IS publications)
Mainly joint work of the first 2 authors
45. Ronen Feldman, Moshe Fresko, Jacob Goldenberg, Oded Netzer, Lyle H. Ungar: Using Text Mining to Analyze User Forums, WMEE'08, Melbourne, Australia, 2008.
Joint work of the first 4 authors
Joint work by all authors
Work done mainly by the first three authors.
(CiteSeer impact factor 1.2, 0 citations in Google Scholar, top 15.64% in CS/IS publications)
Joint Work by all authors
(CiteSeer impact factor 1.82, 0 citations in Google Scholar, top 4.09% in CS/IS publications)
49. Boudoukh, Jacob and Feldman, Ronen and Kogan, Shimon and Richardson, Matthew P., Which News Moves Stock Prices? A Textual Analysis. WFA 2013, EFA 2013. Available at SSRN: http://ssrn.com/abstract=2193667 or http://dx.doi.org/10.2139/ssrn.2193667
Patents
1. Ronen Feldman, Yonatan Aumann, Yonatan Schler, David Landau, Orly Lipshtat, Yaron Ben Yehuda: US Patent 6,442,545, "Term Level Text Mining with Taxonomies”, Aug 27, 2002.
2. Ronen Feldman, Yonatan Aumann, Yaron Ben Yehuda, David Landau: US Patent 6,532,469, "Determining Trends using Text Mining”, Mar 11th, 2003.
3. David Landau, Ronen Feldman, Yonatan Aumann, Orly Lipshtat, Hadar Shemtov: US Patent 7,570,262, "Method and system for displaying time-series data and correlated events derived from text mining”, August 4, 2009.
4. David Landau, Ronen Feldman, Yonatan Aumann, Orly Lipshtat, Hadar Shemtov: US Patent 7,907,140, "Displaying time-series data and correlated events derived from text mining”, March 15, 2011.
Patents Pending
1. Ronen Feldman, Benjamin Rosenfeld, Yair Liberzon: US Patent Application 20,060,253,273, "Information extraction using a trainable grammar”, November 9, 2006.
Magazine Articles
1. Ronen Feldman, "Unified Business Intelligence: Voices of BI”, DM Review, February 2005.
2. Ronen Feldman, "Unified Business Intelligence: Voices from the Next Frontier”, DM Review, March 2005.
3. Ronen Feldman, "Unified Business Intelligence: Three Vs: Best Practices for a Unified Business Intelligence Infrastructure”, DM Review, April 2005.
4. Ronen Feldman, "Unified Business Intelligence: Managing Risk for the Financial Services Market in a World of Uncertainty”, DM Review, June 2005.
5. Ronen Feldman, "Unified Business Intelligence: Defining Alliances in the Post-Unified Business Intelligence World”, DM Review, July 2005.
6. Ronen Feldman, "Unified Business Intelligence: UIMA - Is IBM Hearing Voices?”, DM Review, August 2005.
Ronen Feldman
Jerusalem Index (based on the CORE index for CS publications)
Feb 2012
Articles in Peer-Reviewed Journals
1. R. Feldman and M.C. Golumbic. "Optimization algorithms for scheduling via constraint Satisfiability,” The Computer Journal, pp. 356-364, Aug. 1990.
Jerusalem Index – A
2. R. Feldman and M.C. Golumbic. "Interactive scheduling as a constraint satisfiability problem,” In Annals of Mathematics and Artificial Intelligence, pp. 49-73, Aug. 1990. (at year of publication N.A, 2007 impact factor 0.756, 0 citations in ISI, 4 citations in Google Scholar)
Jerusalem Index – A
3. M. Koppel, R. Feldman and A. Segre "Bias-Driven Revision of Logical Domain Theories,” Journal of Artificial Intelligence Research, pp. 159-208, 1994. (at year of publication N.A, 2007 impact factor 2.45, 0 citations in ISI, 48 citations in Google Scholar, top 0.81% in CS/IS publications)
Jerusalem Index – A
4. R. Feldman, M. Koppel and A. Segre "Extending the Role of Bias in Probabilistic Theory Revision,” Knowledge Acquisition Journal, Vol. 6, pp. 197-214,1994. (at year of publication 7.29, 0 citations in ISI, 4 citations in Google Scholar)
Jerusalem Index – B
5. Amihood Amir, Ronen Feldman, Reuven Kashi: A New and Versatile Method for Association Generation. IS 22(6/7): 333-347 (1997) . (at year of publication N.A, 2007 impact factor 0.83, 0 citations in ISI, 48 citations in Google Scholar, top 30.79% in CS/IS publications)
Jerusalem Index – A
6. Ronen Feldman, Haym Hirsh: Exploiting Background Information in Knowledge Discovery from Text. JIIS 9(1): 83-97 (1997) (at year of publication N.A, 2007 impact factor 1.08, 0 citations in ISI, 52 citations in Google Scholar, top 19.73% in CS/IS publications)
Jerusalem Index – B
7. Ronen Feldman, Ido Dagan, Haym Hirsh: Mining Text Using Keyword Distributions. JIIS 10(3): 281-300 (1998) (at year of publication N.A, 2007 impact factor 1.08, 0 citations in ISI, 89 citations in Google Scholar, top 19.73% in CS/IS publications)
Jerusalem Index – B
8. Ronen Feldman, Willi Klösgen: Data Mining on the Web: A Promising Challenge? KI 12(1): 35-36 (1998)
Jerusalem Index – NA
9. Yonatan Aumann, Ronen Feldman, Orly Liphstat, Heikki Mannila: Borders: An Efficient Algorithm for Association Generation in Dynamic Databases. JIIS 12(1): 61-73 (1999) (at year of publication N.A, 2007 impact factor 1.08, 0 citations in ISI, 24 citations in Google Scholar, top 19.73% in CS/IS publications)
Jerusalem Index – B
10. Ronen Feldman, Yizhar Regev, Michal Finkelstein-Landau, Eyal Hurvitz & Boris Kogan: Mining biomedical literature using information extraction. Current Drug Discovery, Volume 2, Issue 10, pages 19-23, October 2002.
Jerusalem Index – NA
11. Yizhar Regev, Michal Finkelstein-Landau, Ronen Feldman: Using Rule-based Information Extraction for Locating Experimental Evidence in the Biomedical Domain – the KDD Cup 2002. KDD Explorations, December 2002, 3 pages. (at year of publication N.A, 2007 impact factor 0.58, 0 citations in ISI, 0 citations in Google Scholar, top 45.20% in CS/IS publications)
Jerusalem Index – NA
12. Ronen Feldman, Joshua Livnat and Ron Lazar: Earnings Guidance after Regulation FD. The Journal of Investing, 2003, 33 pages. (at year of publication N.A, 2007 impact factor NA, 0 citations in ISI, 2 citations in SSRN)
Jerusalem Index – NA
13. Hagit Shatkay and Ronen Feldman: Mining the Biomedical Literature in the genomic era, a review. Journal of Computational Biology, 10 (6): 821-855 (2003). (at year of publication N.A, 2007 impact factor 0.9, 0 citations in ISI, 148 citations in Google Scholar, top 28.09% in CS/IS publications)
Jerusalem Index – A
14. Ronen Feldman, Yizhar Regev, Michal Finkelstein-Landau, Eyal Hurvitz & Boris Kogan, "Mining the biomedical literature using semantic analysis”, Biosilico 1(2):69-80 (2003).
Jerusalem Index – NA
15. Yonatan Aumann, Amihood Amir, Ronen Feldman, Moshe Fresko, "Maximal Association Rules: a Tool for Mining Associations in Text”, J. Intell. Inf. Syst. 25(3): 333-345 (2005). (at year of publication N.A, 2007 impact factor 1.08, 0 citations in ISI, 47 citations in Google Scholar, top 19.73% in CS/IS publications)
Jerusalem Index – B
16. Ronen Feldman, Benjamin Rosenfeld, Moshe Fresko, "TEG - A Hybrid Approach to Information Extraction”, KAIS, 9(1): 1-18 (2006). (at year of publication N.A, 2007 impact factor 0.844, 0 citations in ISI, 9 citations in Google Scholar, top 45.53% in CS/IS publications)
Jerusalem Index – A
Since last promotion
17. Ronen Feldman, Benjamin Rosenfeld, Joshua Livnat, "Reasons for Late SEC Filings: Computerized Retrieval and Classification”, Journal of Intelligent Data Analysis, 10(2): 183 - 195 (2006). (at year of publication N.A, 2010 impact factor 0.412, 0 citations in ISI, 3 citations in Google Scholar)
Jerusalem Index – B
18. Yonatan Aumann, Ronen Feldman, Benjamin Rosenfeld, Yair Liberzon, Jonathan Schler, "Visual Information Extraction”, KAIS, 10(1): 1-15. (at year of publication 0.833, 2010 impact factor 2.008, 4 citations in ISI, 11 citations in Google Scholar, top 45.53% in CS/IS publications)
Jerusalem Index – A
19. Gregory Piatetsky-Shapiro, Robert Grossman, Chabane Djeraba, Ronen Feldman, Lise Getoor, Mohammed Zaki: "Is there a grand challenge or X-prize for data mining?”, KDD Explorations, Vol 8, Issue 2, 70-77, Dec 2006. (at year of publication N.A, 2007 impact factor 0.58, 0 citations in ISI, 20 citations in Google Scholar, top 45.20% in CS/IS publications)
Jerusalem Index – NA
20. Ronen Feldman, Yizhar Regev, Maya Gorodetsky: A modular information extraction system. Intell. Data Anal. 12(1): 51-71 (2008). (at year of publication 0.428, 2010 impact factor 0.412, 1 citations in ISI, 6 citations in Google Scholar)
Jerusalem Index – B
21. Benjamin Rosenfeld, Ronen Feldman: Self-supervised relation extraction from the Web. Knowl. Inf. Syst. Journal 17(1): 17-33 (2008). (at year of publication 1.733, 2010 impact factor 2.008, 2 citations in ISI, 18 citations in Google Scholar, top 45.53% in CS/IS publications)
Jerusalem Index – A
22. Ronen Feldman; Joshua Livnat; Benjamin Segal : Shorting Companies That Restate Previously Issued Financial Statements. Journal of Investing, Vol. 17, No. 3: 2008, 6-15.
Jerusalem Index – NA
23. Ronen Feldman, Suresh Govindaraj, Joshua Livnat, Benjamin Segal: "Management's Tone Change, Post Earnings Announcement Drift and Accruals”. Review of Accounting Studies Journal 15(4): 2008, 915-953. (at year of publication 1.972, 3 citations in ISI, 25 citations in Google Scholar)
Jerusalem Index – A
24. Ronen Feldman, Joshua Livnat, Yuan Zhang: "Analysts' Earnings Forecast, Recommendation and Target Price Revisions”. Accepted to Journal of Portfolio Management (31 pages). (at year of publication 0.9, 0 citations in ISI, 0 citations in Google Scholar)
Jerusalem Index – B
25. Oded Netzer, Ronen Feldman, Moshe Fresko, Jacob Goldenberg: Mine Your Own Business: Market Structure Surveillance Through Text Mining. Accepted to Marketing Science. (at year of publication 2.194, 0 citations in ISI, 7 citations in Google Scholar)
Jerusalem Index – A
Current Research
The information age has made it easy to store large amounts of data. The proliferation of documents available on the Web, on corporate intranets, on news wires, and elsewhere is overwhelming. However, while the amount of data available to us is constantly increasing, our ability to absorb and process this information remains constant. Search engines only exacerbate the problem by making more and more documents available in a matter of a few keystrokes. Text Mining [1, 2] is a new (I wrote the first paper on Text Mining [1]) and exciting research area that tries to solve the information overload problem by using techniques from data mining, machine learning, NLP, IR and knowledge management. Text Mining involves the preprocessing of document collections (text categorization, information extraction, term extraction), the storage of the intermediate representations, the techniques to analyze these intermediate representations (distribution analysis, clustering, trend analysis, association rules, etc.) and the visualization of the results. My research is centered on the various components of text mining. In the following sections I describe the research activities that I have carried out in recent years and my plans for future research. My main motto in research is the combination of theory and practice: in each of the following areas we have developed a complete theory and shown that it works in practice by implementing a large-scale system based on the theory. In particular, I am now trying to utilize Text Mining for the benefit of the various areas of Business Administration (Accounting, Finance, Marketing and Strategy).
Hybrid Information Extraction
Knowledge-engineering (mostly rule-based) systems were traditionally the top performers in most IE benchmarks [2], such as MUC, ACE and the KDD Cup. Recently, though, machine-learning systems have become the state of the art, especially for simpler tagging problems such as named entity recognition or field extraction. Still, the knowledge-engineering approach retains some of its advantages. It is centered on manually writing patterns to extract the entities and relations. The patterns are naturally accessible to human understanding and can be improved in a controllable way, whereas improving the results of a pure machine-learning system requires providing it with additional training data. However, the impact of adding more data soon becomes negligible, while the cost of manually annotating the data grows linearly. We have developed a hybrid entity- and relation-extraction system that combines the power of the knowledge-based and statistical machine-learning approaches. The system is based on stochastic (probabilistic) context-free grammars (PCFGs) and is called TEG [2, 3], for trainable extraction grammar. The rules for the extraction grammar are written manually, while the probabilities are trained from an annotated corpus. The powerful disambiguation ability of PCFGs allows the knowledge engineer to write very simple and naive rules while retaining their power, thus greatly reducing the required labor. In addition, the amount of training data needed is considerably smaller than that needed by pure machine-learning systems to achieve comparable accuracy. Furthermore, the tasks of rule writing and corpus annotation can be balanced against each other.
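The division of labor described above (rules written by hand, probabilities learned from annotated text) can be illustrated with a minimal sketch. This is not the TEG implementation: the rule names, the counts, and the function `estimate_rule_probabilities` are hypothetical, and plain relative-frequency estimation is assumed as the training method.

```python
from collections import defaultdict


def estimate_rule_probabilities(rule_counts):
    """Relative-frequency estimate: P(lhs -> rhs) = count(lhs -> rhs) / count(lhs)."""
    totals = defaultdict(int)
    for (lhs, _rhs), count in rule_counts.items():
        totals[lhs] += count
    return {
        (lhs, rhs): count / totals[lhs]
        for (lhs, rhs), count in rule_counts.items()
    }


# Hypothetical counts: how often each hand-written rule was used
# in the parses of an annotated corpus.
rule_counts = {
    ("Person", ("FirstName", "LastName")): 6,
    ("Person", ("Title", "LastName")): 3,
    ("Person", ("LastName",)): 1,
    ("Acquisition", ("Company", "acquired", "Company")): 4,
}

probabilities = estimate_rule_probabilities(rule_counts)
for (lhs, rhs), p in sorted(probabilities.items()):
    print(f"{lhs} -> {' '.join(rhs)}: {p:.2f}")
```

In this sketch the knowledge engineer only supplies the rule set; the weights that disambiguate between competing rules for the same left-hand side come entirely from the counts.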
Plans for Future Research. TEG is based on the combination of HMM and CFG. We conjecture that by combining a stronger machine learning algorithm such as CRF, RMM or MIRA with CFG [4, 5] we can create a more powerful hybrid solution. It is our goal to create a hybrid solution that will solve most of the problems that we encountered while trying to develop IE modules with TEG. The most severe of those was the lack of a clear methodology for building accurate IE modules using the CFG formalism of TEG.
Unsupervised Web Extraction
Information Extraction (IE) is the task of extracting factual assertions from text. Most IE systems rely on knowledge engineering or on machine learning to generate extraction patterns – the mechanism that extracts entities and relation instances from text. In the machine learning approach, a domain expert labels instances of the target relations in a set of documents. The system then learns extraction patterns, which can be applied to new documents automatically.
Both approaches require substantial human effort, particularly when applied to the broad range of documents, entities, and relations on the Web. In order to minimize the manual effort necessary to build Web IE systems, we have designed and implemented SRES (Self-Supervised Relation Extraction System)[6-8]. SRES takes as input the names of the target relations and the types of their arguments. It then uses a large set of unlabeled documents downloaded from the Web in order to learn the extraction patterns.
SRES is most closely related to the KnowItAll system developed at University of Washington by Oren Etzioni and colleagues, since both are unsupervised and both leverage relation-independent extraction patterns to automatically generate seeds, which are then fed into a pattern-learning component. KnowItAll is based on the observation that the Web corpus is highly redundant. Thus, its selective, high-precision extraction patterns readily ignore most sentences, and focus on sentences that indicate the presence of relation instances with very high probability.
In contrast, SRES is based on the observation that, for many relations, the Web corpus has limited redundancy, particularly when one is concerned with less prominent instances of these relations (e.g., the acquisition of Austria Tabak). Thus, SRES utilizes a more expressive extraction pattern language, which enables it to extract information from a broader set of sentences. SRES relies on a sophisticated mechanism to assess its confidence in each extraction, enabling it to sort extracted instances, thereby improving its recall without sacrificing precision.
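The effect of sorting extractions by confidence can be sketched as follows. The instances, the confidence values, the gold set, and the function `precision_recall_at` are all hypothetical illustrations, not SRES output or the SRES confidence estimator.

```python
def precision_recall_at(ranked, gold, k):
    """Precision and recall of the k highest-confidence extractions."""
    top = {instance for instance, _confidence in ranked[:k]}
    correct = len(top & gold)
    precision = correct / len(top) if top else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical Acquisition(buyer, target) extractions with confidence scores.
extractions = [
    (("CompanyA", "CompanyB"), 0.95),
    (("CompanyC", "CompanyD"), 0.40),
    (("CompanyE", "CompanyF"), 0.80),
    (("CompanyG", "CompanyH"), 0.15),
]
# Hypothetical set of true relation instances.
gold = {("CompanyA", "CompanyB"), ("CompanyE", "CompanyF"), ("CompanyI", "CompanyJ")}

# Sort by confidence, highest first, then evaluate at different cut-offs.
ranked = sorted(extractions, key=lambda pair: pair[1], reverse=True)
for k in (2, 4):
    precision, recall = precision_recall_at(ranked, gold, k)
    print(f"top {k}: precision={precision:.2f} recall={recall:.2f}")
```

With a reliable confidence score the correct instances concentrate at the top of the list, so the cut-off can be chosen to keep precision high while retaining as much recall as the data allows.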
Our main contributions are as follows:
1. We introduced the first domain-independent system to extract relation instances from the Web with both high precision and high recall.
2. We showed how to minimize the human effort necessary to deploy SRES for an arbitrary set of relations, including automatically generating and labeling positive and negative examples of the relation.
3. We performed an experimental comparison between SRES and the state-of-the-art KnowItAll system, and showed that SRES can double or even triple the recall achieved by KnowItAll for relatively rare relation instances.
Plans for Future Research. This research area is very promising and we expect that it can revolutionize the whole area of text mining. We have many research goals that we want to achieve within the next 5-10 years.
1. Integrate anaphora resolution into SRES so that we can extract relations that are spread across multiple sentences.
2. Integrate a NER component into SRES (rule based, CRF, RMM) to test how precision can be improved vs. simple NP extraction. [6]
3. Move to a more powerful pattern language that will include more linguistic features. [9]
4. Move from binary predicates to n-ary predicates (such as management changes, earnings announcements, etc.).
5. Utilize clustering techniques to learn families of relations simultaneously (such as family relations or business relations between people).
6. Find techniques to boost the recall of unsupervised web extraction while still maintaining high precision. [10]
7. Utilize the web to validate the results of SRES. [10]
8. Utilize SRES-like techniques for classic information extraction.
9. Use pattern matching algorithms to reduce the computational complexity of SRES. [10, 11]
Visual Information Extraction
Most information extraction systems simplify the structure of the documents they process by ignoring many of the visual characteristics of the document, e.g. font type, size and location, and process the text as a linear sequence. This allows the algorithms to focus on the semantic aspects of the document. However, valuable information is lost. Consider, for example, an article in a scientific journal. The title is readily recognized based on its special font and location, but less so based on its semantic content, which may be similar to the section headings. The same holds for the author names, section headings, running title, etc. Thus, much important information is provided by the visual layout of the document. We have developed an information extraction system [12] that is based solely on the visual characteristics of the document, and have shown that this visual information alone is sufficient to provide high-accuracy extraction for specific fields (e.g. the title, author names, publication date, etc.).
We developed a general algorithm that performs the IE task based on the visual layout of the document. The algorithm employs a machine learning approach whereby the system is first provided with a set of training documents in which the desired fields are manually tagged. Based on these training examples, the system automatically learns how to find the corresponding fields in future documents.
Problem Formulation. A document D is a set of primitive elements D = {e1,…,en}. A primitive element can be a character, a line, or any other visual object, depending on the document format. A primitive element may have any number of visual attributes, such as font size and type, physical location, etc. The bounding box attribute, which provides the size and location of the bounding box of the element, is assumed to be available for all primitive elements. We define an object in the document to be any set of primitive elements. The Visual Information Extraction (VIE) task is as follows. We are provided with a set of target fields F = {f1,…,fk} to be extracted, and a set of training documents T = {T1,…,Tm} wherein all occurrences of the target fields are tagged. Specifically, for each target field f and training document T, we are provided with the object f(T) of T that is of type f (f(T) = ∅ if f does not appear in T). The goal is, when presented with an untagged query document Q, to correctly tag the occurrences of the target fields that exist in Q (not all target fields need be present in each document).
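The formulation above maps naturally onto a few small data structures. The following is only a sketch under assumptions: the class names, attribute names, and sample values are hypothetical and are not taken from the actual system.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Tuple


@dataclass(frozen=True)
class BoundingBox:
    x: float
    y: float
    width: float
    height: float


@dataclass(frozen=True)
class PrimitiveElement:
    content: str
    bbox: BoundingBox                           # assumed available for every element
    attributes: Tuple[Tuple[str, str], ...] = ()  # e.g. (("font", "Times"),)


# An object is any set of primitive elements.
DocObject = FrozenSet[PrimitiveElement]


@dataclass
class TaggedDocument:
    elements: FrozenSet[PrimitiveElement]
    tags: Dict[str, DocObject] = field(default_factory=dict)

    def field_object(self, name: str) -> DocObject:
        """Return f(T); the empty set if field f does not appear in T."""
        return self.tags.get(name, frozenset())


title = PrimitiveElement("Visual Information Extraction",
                         BoundingBox(72, 50, 300, 18), (("font_size", "18"),))
body = PrimitiveElement("Most information extraction systems ...",
                        BoundingBox(72, 120, 450, 10), (("font_size", "10"),))

doc = TaggedDocument(frozenset({title, body}), {"title": frozenset({title})})
print(doc.field_object("title") == frozenset({title}))   # tagged field
print(len(doc.field_object("author")))                   # untagged field -> empty set
```

Frozen dataclasses are used so that elements are hashable and can be collected into sets, matching the set-based definitions of documents and objects.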
Results. We have developed a general framework and algorithm for the VIE task[12, 13]. We have shown that the VIE task can be decomposed into two subtasks. First, for each document (both training and query) we must group the primitive elements into meaningful objects (e.g. lines, paragraphs, etc.), and establish the hierarchical structure among these objects. Then, in the second stage, the structure of the query document is compared with those of the training documents to find the objects corresponding to the target fields. We have also shown how to improve the results by introducing the notion of templates which are groups of training documents with a similar layout (say, articles from the same journal). Using templates we can identify the essential features of the page layout, ignoring particularities of any specific document. We implemented the system for a VIE task on a set of documents containing financial analyst reports. The documents were in PDF format. Target fields included the title, authors, publication dates, and others.
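The first subtask, grouping primitive elements into larger objects, can be illustrated with a deliberately simple heuristic that merges words into lines when their vertical positions nearly coincide. This is an assumed stand-in for illustration, not the grouping algorithm of the actual system; the function `group_into_lines`, the `tolerance` parameter, and the sample coordinates are hypothetical.

```python
def group_into_lines(elements, tolerance=2.0):
    """Group (text, x, y) elements into lines.

    An element joins the current line if its y coordinate is within
    `tolerance` of the y coordinate of the element that started the line.
    """
    lines = []
    for text, x, y in sorted(elements, key=lambda e: (e[2], e[1])):
        if lines and abs(lines[-1]["y"] - y) <= tolerance:
            lines[-1]["items"].append((x, text))
        else:
            lines.append({"y": y, "items": [(x, text)]})
    # Within each line, order the words from left to right.
    return [" ".join(text for _x, text in sorted(line["items"])) for line in lines]


# Hypothetical word-level primitive elements: (text, x, y).
words = [
    ("Mining", 120, 50.5),
    ("Text", 72, 50.0),
    ("Feldman", 110, 80.0),
    ("Ronen", 72, 80.2),
]

print(group_into_lines(words))
```

The same idea can be applied again one level up (lines into paragraphs) to obtain the hierarchical structure that the second stage compares against the training documents.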
Plans for Future Research. Clearly, our visual approach also has its limitations. First and foremost, the visual approach can only capture fields with distinct visual characteristics, such as the title, authors, publication date, etc. Semantic elements mentioned within the running text, such as people's names, locations, etc., clearly cannot be detected by the visual approach. In addition, the learning process that we used only works for features and structures that have a relatively high level of consistency among documents, such as title, author, etc. The method would be less applicable to structures with a high level of variation between documents. Ultimately, we believe that a complete solution for information extraction should make use of the entire spectrum of available information: semantic, syntactic and visual. In such a system, our visual approach would be one of the components in a combined, integrated approach. It is one of our main goals within the next 5 years to develop such a system.
Additional Areas of Active Research
• A visual query language (and efficient query execution engine) for link analysis
• A visual text mining environment [14]
• An integrated system for email mining (using the Enron email repository as a test case)
• Integration of Information Extraction into the Semantic Web framework (OWL, RDF, etc.)
• Information extraction in Hebrew and Arabic (developed for the Israeli MOD)
• Automatic identification and correction of annotation errors (for machine learning based IE)
• Automatic construction of Knowledge Bases from the PubMed database (in collaboration with UPenn's CS Dept and Medical School)
• An IE based search engine (in collaboration with U of Washington, Seattle)
• Temporal Text Mining Environment
• Text Mining of Chat Rooms and Messenger Logs.
Applications of Text Mining to Accounting, Marketing and Strategy
Accounting
My research focuses on the analysis of company reports and news articles that report on business events that have an impact on the performance of the companies.
In particular, I am developing classification models that generate a variety of alerts from the analysis of the MD&A section of 10-Ks and 10-Qs. Our goal is to reach an F1 score above 90%. Applications of this classification technique are reported in the following papers: [15-17].
Another direction that we are pursuing is predicting the market reaction to various events found in documents such as 10-Ks and 10-Qs [18] and analyst recommendations [19].
We have built a system called the Stock Sonar [20] that is able to analyze the sentiment of a stock based on the articles and blogs written about it. We have also shown that by utilizing tweets about each stock we can build a strategy that would beat the S&P 500 [21].
Marketing
Product discussion boards are a rich source of information about consumer sentiment toward products, and this source is being increasingly exploited. Most sentiment analysis has looked at single products in isolation, but users often compare different products, stating which they like better and why. We developed [22-24] a set of techniques for analyzing how consumers view product markets. Specifically, we extracted relative sentiment and comparisons between products, in order to understand which attributes users compare products on and which products they prefer on each dimension.
Strategy
We want to analyze the 10-Q and 10-K reports of public companies and deduce the variability of the product mix of the company and its geographic spread. The analysis will be done automatically using information extraction techniques. Our conjecture is that companies with a better product mix and wider geographic spread fare better in difficult economic times.
References
4. Rosenfeld, B., R. Feldman, and M. Fresko. A Systematic Cross-Comparison of Sequence Classifiers. in SDM-06. 2006. Maryland, USA.
12. Rosenfeld, B., R. Feldman, and Y. Aumann. Structural Extraction from Visual Layout of Documents. in CIKM. 2002.
13. Aumann, Y., et al., Visual Information Extraction. Knowledge and Information Systems Journal, 2007. 10(1): p. 1-15.
17. Feldman, R., J. Livnat, and R. Lazar, Earnings Guidance after Regulation FD. The Journal of Investing, 2003.
22. Feldman, R., et al. Extracting Product Comparisons from Discussion Boards. in ICDM-07. 2007.
Honors And Awards
A. Academic Awards and Fellowships
- The Levi Eshkol Fellowship (1993-1995)
- 1992-1993 Soref Post Doctoral Fellowship, Bar-Ilan University.
B. Teaching Awards
- 1985-1986 Outstanding teaching elected by the students in 1985 (Bar-Ilan University)
- 1995-1996 Best Lecturer Award, Michlelet Yosh, Ariel.
C. Scientific Reviews
- Editorial board: - Reviewer for the following journals: Knowledge Discovery and Data Mining Journal, JIIS, IEEE TKDE, Computational Linguistics
- Scientific review of research proposals for:
ISF (Academia), GIF (German-Israeli Foundation for Scientific Research and Development), BSF
- Membership in International Conferences Program Committees:
ACM-SIGKDD, ACM-SIGIR, ACM-WWW, PKDD, IJCAI, ACM-CIKM, IEEE-ICDM, ACL, VLDB, SDM, ECIR
D. Organization of Conferences/ Colloquiums/Sessions
1. Organizing Chair, BISFAI-1993
2. Organizing Chair, IJCAI-1999 Workshop on TEXT MINING: FOUNDATIONS, TECHNIQUES AND APPLICATIONS, Stockholm, Sweden.
3. Organizing Committee, Workshop on Operational Text Classification Systems, SIGIR'2001