Comments on Salmon Run: Document Classification using Naive Bayes
Blog author: Sujit Pal
Feed updated: 2024-03-05T03:17:02.289-08:00
18 comments, newest first

Sujit Pal (2014-10-22T11:44:31.127-07:00)
Hi, sorry I no longer have the data so can't share. If you know the topics you want and have some training data, i.e., documents that are tagged with the topics, then you could use NB (although I would suggest using one of the standard implementations like Weka or RapidMiner or Scikit-Learn instead of building your own) to classify previously uncategorized documents into one of the classes (or ...

Anonymous (2014-10-22T09:52:54.455-07:00)
Hi, I'm a student and I have a problem: finding a topic name from a bag of words. Can I use this for that?
And, can you send me the file you use in your database to vuhoanghiep1993@gmail.com
Thank you

Sujit Pal (2012-03-28T15:47:04.031-07:00)
Hi Hardik, I dabbled with C# and Mono when it first came out, but I don't know enough C# to be of any help with converting it. In any case, you may want to start from scratch; classification with Naive Bayes, treating your words as a bag-of-words, is not too difficult if you have reasonably good (well-polarized) training data.
This was written when I wasn't that familiar with NB, so ...

Hardik (2012-03-28T06:07:30.698-07:00)
Sir
can you please help me to convert the above code to C#
it will be very helpful

Sujit Pal (2010-10-22T11:51:49.807-07:00)
Hi Reuben, I've been meaning to learn Weka, but haven't yet. I am assuming you just want to use Weka's Naive Bayes library, right? In that case, I am guessing you probably don't need to use jtmt. Best of luck on your project though, it sounds very interesting.

REUBEN (2010-10-19T22:36:23.424-07:00)
OK, thanks..
I want to do a project that will categorise students' articles at uni into business, science, engineering and pharmacy schools, using naive bayes and weka this time

Sujit Pal (2010-10-18T11:56:29.175-07:00)
Hi Reuben, JTMT is not a "real" project, in the sense that it does not aim to provide all the tools you would require to solve a known problem. It's basically just a bunch of code which I wrote to solve some problems that I had, so it's kind of difficult to "try" the project. If you find that your problem is solved by some code in JTMT, by all means use it, and if you encounter ...

REUBEN (2010-10-17T03:32:57.199-07:00)
hello sir...
I want to try your project, so would you mind helping me get started with it?

Sujit Pal (2010-01-21T15:12:06.046-08:00)
Thanks Adil. I wouldn't make too much of my "teachings" if I were you though - most of this stuff is me trying to learn myself :-).

Adil (2010-01-10T05:29:45.667-08:00)
Thanks a lot sir. I'll try to make something out of it. That was a great help and you are great.
I'll let you know my results when I'm done.
May God be with you and young minds be able to make something out of your teachings.

Sujit Pal (2010-01-09T19:54:18.122-08:00)
Thanks Adil. Unfortunately, the website which I was using for my input in here no longer exists - Yahoo! stopped making it free, and I did not think the content was compelling enough to keep around by paying for it :-). If you just need a pre-categorized corpus for testing though, a better choice would be http://www.daviddlewis.com/resources/testcollections/reuters21578/ ...

Adil (2010-01-09T06:18:38.584-08:00)
Hello sir
This is really a great help. I am trying to simulate your work using Weka. The link you provided to your database seems to no longer exist. Creating another database currently keeps me out of context. Can you please provide a sample dataset of what you used so that I can make sure I am getting my implementation correct (tallying with your results)?
Thanks

Sujit Pal (2009-08-07T09:57:54.281-07:00)
Thanks, I plan to at some point, haven't gotten round to it yet.

Anonymous (2009-08-01T16:06:59.166-07:00)
Try Weka :-)

Sujit Pal (2007-12-22T01:46:00.000-08:00)
Thanks for the kind words, Marcus. To answer your question, while I am no expert on this subject, I think it does not matter what your source language is. As long as the words you use to train your classifier are found in the files you wish to classify, the classification should work.

BTW, I took a look at the UK version of your site, nice idea. And I guess the auto-tagger will give ...

Anonymous (2007-12-19T21:50:00.000-08:00)
Thanks man! This was really helpful. I used classifier4j some time ago but found it too immature for production, until now.

I have a question though about Bayesian analysis.
Will it classify different languages without problems, or is it loaded with an English dictionary or such?

I have a couple of 100k blogs in my database and a couple of million entries
so far many ...

Sujit Pal (2007-05-26T18:56:00.000-07:00)
Thanks, I hope you find it interesting (and useful)... :-).

Lukáš Vlček (2007-05-25T05:20:00.000-07:00)
Nice stuff. I hope to read it one day (at least I am subscribing to your blog feed now).