Monitoring RSS / Atom feeds
To monitor a new RSS or Atom feed, enter the address of the feed and click "Add feed" (the address will be a URL, e.g. http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/uk/rss.xml). The feed will be added to the list on the left. You can then define the keywords (see Keyword format) and the refresh interval (how often the feed is checked). Active feeds are searched automatically at the refresh interval, but you can use the buttons to search individual feeds or all active feeds straight away if you don't want to wait. When an article is found that contains your keywords, Pheedstorm will pop up a window with the details and a link to the originating website.
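To give a feel for what checking a feed involves, here is a minimal sketch in Python that pulls the articles out of an RSS 2.0 document. It is not Pheedstorm's own code: the sample feed text and the function name parse_rss_items are made up for illustration, and it only handles RSS (Atom feeds use different element names).

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of an RSS 2.0 document. A real feed would be
# downloaded from the feed address instead.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First story</title>
      <link>http://example.com/first</link>
      <description>Summary of the first story.</description>
    </item>
    <item>
      <title>Second story</title>
      <link>http://example.com/second</link>
      <description>Summary of the second story.</description>
    </item>
  </channel>
</rss>"""


def parse_rss_items(xml_text):
    """Return a list of (title, link, description) tuples from RSS 2.0 text."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        # findtext returns the default when an element is missing.
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        description = item.findtext("description", default="")
        items.append((title.strip(), link.strip(), description.strip()))
    return items


if __name__ == "__main__":
    for title, link, _ in parse_rss_items(SAMPLE_RSS):
        print(title, "-", link)
```

Each tuple holds the text that would be searched for keywords, plus the link that the alert window would point to.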
AI / Learning
"The more contact I have with humans, the more I learn" - Cyberdyne Systems Model T-101, "Terminator 2"
Simple keyword monitoring isn't always sufficient for informing you about news articles that you are interested in, so the program can learn over time about the kind of articles that you like, and will start to show you similar articles even if they don't contain any of your keywords. It works as follows: each time Pheedstorm shows you a web page it will offer you the chance to rate the page on a scale of "totally irrelevant" to "very relevant". As you indicate your preferences for or against particular types of content, Pheedstorm will start to learn about the kind of stories that interest you and will start to show you other articles it thinks you may find interesting. The more you provide feedback, the more it will learn about what interests you.
To allow you to kickstart the learning process I have added a "Learn" button. When you click it, Pheedstorm will show you all articles from the selected feed and ask you to rate them. This can take a while, but the more feedback you provide, the more it will learn. If it starts to show too many or too few articles you can adjust the slider "Show articles with relevance LOW...HIGH" - the nearer to "HIGH" you move the slider, the fewer articles will be shown, as an article will need a very high relevance to trigger an alert. You can repeat the learning process as many times as you like with different feeds. I recommend you run it at least 10 times on a variety of different feeds, e.g. news, entertainment, sport, business, politics etc., so that the program can start to build up a comprehensive database. If it shows you an article you don't like, simply rate it with a negative score and it will update its relevance database accordingly - this will reduce the probability of similar articles being shown.
To start you off I have added some feeds from the BBC website. To start the program learning, click "Learn" against a number of these feeds (you can delete any feeds you don't like, or add ones from other websites).
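The learning method Pheedstorm actually uses is not described here, so the following Python sketch is only an illustration of the general idea: ratings adjust a weight for each word, and an article is shown when its average word weight reaches a threshold (playing the role of the LOW...HIGH slider). The rating scale of -2 to +2, the stop-word set and all names are assumptions made for the example.

```python
import re
from collections import defaultdict

# Hypothetical set of common words that are left out of scoring.
STOP_WORDS = {"the", "and", "a", "of", "to", "in", "said"}


def tokenize(text):
    """Lower-case the text and return its words, minus the stop words."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


class RelevanceModel:
    """Toy relevance model: one running weight per word."""

    def __init__(self):
        self.weights = defaultdict(float)

    def rate(self, text, rating):
        """Record feedback; rating is assumed to run from -2 to +2."""
        for word in set(tokenize(text)):
            self.weights[word] += rating

    def score(self, text):
        """Average weight of the distinct words in the text (0.0 if none)."""
        words = set(tokenize(text))
        if not words:
            return 0.0
        return sum(self.weights.get(w, 0.0) for w in words) / len(words)

    def should_show(self, text, threshold):
        """A higher threshold means fewer articles are shown."""
        return self.score(text) >= threshold


if __name__ == "__main__":
    model = RelevanceModel()
    model.rate("Nuclear talks with Iran resume", 2)
    model.rate("Celebrity gossip from the awards show", -2)
    print(model.score("Iran nuclear deal reached"))
    print(model.should_show("Celebrity awards gossip", 0.5))
```

In this sketch a negative rating pulls down the weight of every word in the article, which is why similar articles become less likely to be shown afterwards.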
A brief note on privacy - Pheedstorm will gradually build up a large store of information about the kind of things you find interesting, and this information is stored in the "DATA" directory. If this concerns you, you may want to consider restricting access to this directory, or if you are really paranoid, putting the application on a TrueCrypt drive. Please note that at no point does the program transmit any of this data anywhere else.
Keyword format
For keyword monitoring, enter one set of keywords per line, with the words on a line separated by commas. An alert is only triggered if all of the keywords on a line are found. For example:
1:iran,nuclear
0:grand theft auto,game
1:pentagon,hack
Note that only lines that start with "1:" are considered "active" and will be checked (in the example above, the line starting with "0:" is ignored) - this is so that you can temporarily disable a set of keywords without having to remove them entirely.
You can specify words that must not be found by putting a ! character in front of the word. For example, to show articles that contain the words "George" and "Bush" but not "Iraq", you could put "1:george,bush,!iraq".
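The matching rules above can be sketched in Python as follows. This is an illustration rather than Pheedstorm's own code: the function names are made up, and it assumes that matching is case-insensitive and works on substrings, which this page does not state.

```python
def parse_keyword_line(line):
    """Parse a line such as '1:george,bush,!iraq'.

    Returns (active, required, excluded), or None if there is no ':' prefix.
    """
    prefix, sep, rest = line.strip().partition(":")
    if not sep:
        return None
    required, excluded = [], []
    for word in rest.split(","):
        word = word.strip().lower()
        if word.startswith("!"):
            word = word[1:].strip()
            if word:
                excluded.append(word)
        elif word:
            required.append(word)
    # Only lines whose prefix is exactly "1" are active.
    return prefix.strip() == "1", required, excluded


def line_matches(line, text):
    """True if an active line's keywords all match and no excluded word does."""
    parsed = parse_keyword_line(line)
    if parsed is None:
        return False
    active, required, excluded = parsed
    if not active or not (required or excluded):
        return False
    text = text.lower()  # assumption: matching ignores case
    return all(w in text for w in required) and not any(w in text for w in excluded)


if __name__ == "__main__":
    article = "George Bush visited Texas today"
    print(line_matches("1:george,bush,!iraq", article))
    print(line_matches("0:grand theft auto,game", "A new Grand Theft Auto game"))
```

Note that "grand theft auto" is treated as a single phrase, because only commas separate keywords.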
Other configuration
There are some additional configuration options defined in the file Pheedstorm.conf. Some of the more useful options are described below:
proxyHost=
proxyPort=
proxyUsername=
proxyPassword=
proxyBasicAuthentication=true
You can use these options to configure a proxy server. You may need this if you are using the program at work.
repeatInterval=10080
This is the number of minutes to wait before showing a particular alert again, i.e. if the program has already popped up a message about a particular keyword, you don't want it to show you the same alert 5 minutes later. The default is 10080, which is the number of minutes in one week.
forceToForeground=true
If this is set to true, Pheedstorm will always pop up the alert window in front of any applications that are running. If this is set to false, it will simply flash the icon on the taskbar when it needs to alert you.
showNagScreens=true
If this is set to true, Pheedstorm will show a reminder message if you do not provide a relevance rating for at least 50% of the items it shows you. To stop these reminders, set this option to false.
runAfterIdleFor=0
This setting can be used to try to stop the program taking up lots of CPU and bandwidth while you are working on your PC. If it is set to a non-zero value, the program will only scan the feeds for articles when your PC has been idle for at least this number of seconds (i.e. you have not used the mouse or keyboard). If left at zero, the program will not check whether you are using your PC.
verboseOutput=false
The program writes output to a status panel on each page as it runs, so that you can see what it is doing. If you set this option to true, additional information will be written, which can help diagnose problems. Normally you should leave this set to false, as the program will run a lot faster.
processMessagesInterval=10
Because the program can take a long time to analyse pages for relevance, it can become unresponsive to user events such as mouse clicks (although it will respond eventually). This value specifies (in seconds) how often the program should check for outstanding GUI messages such as mouse clicks. The lower this value, the more frequently the program will respond, but this will increase the time taken to scan articles and web pages.
Common words
The file "data\commonwords.dat" contains a list of common words that are not included when checking the relevance of a piece of text. You can add additional words to this file if you like - this can help to stop false positives, i.e. where a page has a higher relevance rating than it should have due to the presence of common words such as "the", "and", "news", "july", "said", "went" etc.
Frequently asked questions