Browser Automation PDF
d.j.t 20 Points
Answers
Dominik: "what happens there is (while working fine most of the time), that SOMETIMES the
first table is copied, the one that was displayed when first browsing to the page, before doing
the selections and refreshing. so to me it seems as if the script doesn't wait for the
documentcompleted event any more. but only sometimes! sometimes the correct table is
also copied, sometimes not. i don't understand this! (actually i never fully understood the
documentcompleted-event thing). the only way i can explain it is that the old computer is too
slow... i'm frustrated!"
Hi Dominik,
In Part 6 you are extracting the javascript immediately after automatically clicking the More
button without waiting for the next webpage to load with new data:
Code Snippet
' Part 6: Automatically click Continue link
Dim hrefElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("a")
For Each curElement As HtmlElement In hrefElementCollection
    Dim controlName As String = curElement.GetAttribute("id").ToString
    If controlName.Contains("LBtn_More") Then
        curElement.InvokeMember("Click")
    End If
Next
extract()
The code in my first post on this thread fixes that problem. The DocumentCompleted event fires
when a new webpage loads. After clicking the button in Part 4 we have to wait for the next
DocumentCompleted which tells us that next webpage has loaded with new data. Similarly with
clicking the More button in Part 6 (see:
http://msdn2.microsoft.com/en-us/library/system.windows.forms.webbrowser.documentcompleted.aspx):
Code Snippet
Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
    document_completed = document_completed + 1
    If document_completed = 1 Then ' First table
        Part2() ' Automatically select specified option from ComboBox
        Part3() ' Automatically check the CheckBox
        Part4() ' Automatically click the Button
    ElseIf document_completed > 1 And document_completed < 11 Then ' Second to tenth tables
        Part5() ' Extract javascript and update last_datetime
        If last_datetime > earliest_datetime Then
            Part6() ' Click Continue Button
        End If
    End If
End Sub
But the If statements need to be refined a bit because DocumentCompleted fires twice per page
(once for the page banner and once for the default page containing the javascript data that we
want):
Code Snippet
If (document_completed < 3) And (e.Url.AbsoluteUri = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX") Then
    ' ...
ElseIf (document_completed > 2) And (e.Url.AbsoluteUri = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX") Then
The second problem is that you are using a 12 hour clock without specifying a.m. or p.m. when
generating the filename so there is potential for overwriting old files or appending new data to an
old file:
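Code Snippet
' "hh" is the 12-hour clock: a morning run and an afternoon run can produce the same name
Dim currentDataTime As String = DateTime.Now.ToString("yyyyMMddhhmmss")
Code Snippet
' "HH" is the 24-hour clock
Dim currentDataTime As String = DateTime.Now.ToString("yyyyMMddHHmmss")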
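To make the difference between the two format strings concrete, here is a minimal console sketch. The module name and the two fixed example times are made up for illustration; only the format strings come from the snippets above.

```vbnet
Imports System
Imports System.Globalization

Module TimestampFormatSketch
    Sub Main()
        ' Two fixed example times, exactly twelve hours apart (hypothetical values).
        Dim morning As New DateTime(2009, 10, 14, 6, 3, 0)
        Dim evening As New DateTime(2009, 10, 14, 18, 3, 0)

        ' "hh" = 12-hour clock: both times produce the same string.
        Console.WriteLine(morning.ToString("yyyyMMddhhmmss", CultureInfo.InvariantCulture))
        Console.WriteLine(evening.ToString("yyyyMMddhhmmss", CultureInfo.InvariantCulture))

        ' "HH" = 24-hour clock: the two strings differ.
        Console.WriteLine(morning.ToString("yyyyMMddHHmmss", CultureInfo.InvariantCulture))
        Console.WriteLine(evening.ToString("yyyyMMddHHmmss", CultureInfo.InvariantCulture))
    End Sub
End Module
```

With "hh" both times format as 20091014060300, so a file written in the evening would collide with one written in the morning. With "HH" the evening time becomes 20091014180300.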
The other bugs I pointed out were "features" that I had introduced myself when converting from
VB to C++ (I was a bit unfamiliar with the Using statement) so you can ignore these.
Edited by Tim Mathias Wednesday, October 14, 2009 6:03 PM Reformatted code snippets.
> Is it exactly necessary to mention e.Url.AbsoluteUri = ... because the url stays the same
throughout the whole procedure?
It's essential because the URL DOESN'T stay the same throughout the whole procedure: the
webpage contains a link to a banner page that also calls the procedure after it loads. I've added a
MessageBox to show these two URLs. It's this double message that causes the first table to be
extracted in your script (i.e. the table we want to ignore).
I've also added an If statement that returns when the banner URL completes (it's a bit neater than
the former If tests I wrote).
Code Snippet
Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
    MessageBox.Show("DocumentCompleted: " & e.Url.AbsoluteUri)
    If Not (e.Url.AbsoluteUri = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX") Then
        Return
    End If
    document_completed = document_completed + 1
    If document_completed = 1 Then ' First table
        Part2() ' Automatically select specified option from ComboBox
        Part3() ' Automatically check the CheckBox
        Part4() ' Automatically click the Button
    ElseIf document_completed > 1 Then
        Part5() ' Extract javascript and update last_datetime
        If last_datetime > earliest_datetime Then
            Part6() ' Automatically click Continue Button
        Else
            Me.Close() ' Part 7: Close programme
        End If
    End If
End Sub
Edited by Tim Mathias Wednesday, October 14, 2009 5:38 PM Reformatted code snippet.
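The effect of the early Return can be tried without a browser. The following is only a console sketch of the filtering idea: the URLs, the module name and the ShouldHandle helper are hypothetical stand-ins, and the real handler would receive the URL through e.Url.AbsoluteUri.

```vbnet
Imports System

Module UrlFilterSketch
    ' Hypothetical stand-ins for the data page URL and the counter from the thread.
    Private Const TargetUrl As String = "http://example.com/quotes"
    Private document_completed As Integer = 0

    ' Mimics the top of the DocumentCompleted handler: banner completions are
    ' ignored, completions of the data page are counted.
    Function ShouldHandle(ByVal completedUrl As String) As Boolean
        If Not (completedUrl = TargetUrl) Then
            Return False
        End If
        document_completed += 1
        Return True
    End Function

    Sub Main()
        ' Simulated event sequence: banner and data page complete once per page load.
        Dim completedUrls() As String = {"http://example.com/banner", TargetUrl, "http://example.com/banner", TargetUrl}
        For Each url As String In completedUrls
            Console.WriteLine(url & " -> handled: " & ShouldHandle(url).ToString())
        Next
        Console.WriteLine("document_completed = " & document_completed.ToString())
    End Sub
End Module
```

Four events arrive, but the counter only reaches 2, one per load of the data page, which is what the Part2 to Part6 dispatch relies on.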
I did originally limit the document_completed count to 10 tables to avoid an infinite repeat in case
there was a problem parsing the DateTime from the webpage (bold red). You'll have the cybercops
after you for a suspected DoS attack.
Here's the ultimate bug free code (until you find the next one):
Code Snippet
Dim previous_last_datetime As DateTime

Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
    MessageBox.Show("DocumentCompleted: " & e.Url.AbsoluteUri)
    If Not (e.Url.AbsoluteUri = seite) Then
        Return
    End If
    document_completed = document_completed + 1
    If document_completed = 1 Then ' First table
        Part2() ' Automatically select specified option from ComboBox
        Part3() ' Automatically check the CheckBox
        Part4() ' Automatically click the Button
    ElseIf document_completed > 1 And document_completed < 11 Then
        previous_last_datetime = last_datetime
        Part5() ' Extract javascript and update last_datetime
        If previous_last_datetime > last_datetime Then
            Part6() ' Automatically click Continue Button
        Else
            Me.Close() ' Part 7: Close programme
        End If
    End If
End Sub
Edited by Tim Mathias Wednesday, October 14, 2009 5:30 PM Reformatted code snippet.
All replies
Hi d.j.t,
http://blogs.charteris.com/blogs/edwardw/archive/2007/07/16/watin-web-application-testing-in-net-introduction.aspx
http://watin.sourceforge.net/
Basic features:
In order to perform an action against an element you must first obtain a reference to it. This can be
done in 3 different ways:
Regards,
Martin
Hi d.j.t,
Code Block
Public Class Form1
End If
Next
End Class
Best regards,
Martin
d.j.t wrote:
... and copies the new table to a file.
To achieve the task, here are two suggestions:
1.
Code Block
Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
End Sub
You need to Add Reference... -> COM tab -> Find Microsoft CDO For Windows 2000 Library and Microsoft
ActiveX Data Objects 2.5 Library and add them to your project
Code Block
Imports ADODB
Imports CDO
Hi Martin
your first reply is great! Thanks a lot!
1. I just have one problem with the first task: when executing, the selection of the combo- and checkboxes
works perfectly fine, but the "aktualisieren" button is clicked endlessly. I'd like to stop that. (I used a
WebBrowser element from the toolbox in Form1)
If you have an idea how to deal with one of the problems, especially the first, I'd appreciate if you could
post it.
Hi d.j.t,
-> You should place the two part code (Automation part and Save page part) into the
WebBrowser1_DocumentCompleted event. Don't name it as WebBrowser1_DocumentCompleted2.
Code Block
Public Class Form1
    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        WebBrowser1.Dock = DockStyle.Fill
        Me.WindowState = FormWindowState.Maximized
        ' Part 1: Use WebBrowser control to load web page
        WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")
    End Sub

    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        ' Part 2: Automatically select specified option from ComboBox
        Dim theElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("select")
        For Each curElement As HtmlElement In theElementCollection
            Dim controlName As String = curElement.GetAttribute("name").ToString
            If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Step" Then
                curElement.SetAttribute("Value", 0)
            End If
        Next
        Dim theWElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("input")
        For Each curElement As HtmlElement In theWElementCollection
            Dim controlName As String = curElement.GetAttribute("name").ToString
            ' Part 3: Automatically check the CheckBox
            If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$CBx_CapitalMeasures" Then
                curElement.SetAttribute("Checked", True)
            ' Part 4: Automatically click the button
            ElseIf controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$IBtn_Refresh1" Then
                ' JavaScript has a click method that we need to invoke on the current button element.
                curElement.InvokeMember("click")
            End If
        Next
        ' After automatically clicking the button,
        ' append the following code to save the webpage as htm file
        Dim w As IO.StreamWriter = New IO.StreamWriter("C:\Table.htm")
        w.Write(WebBrowser1.Document.Body.InnerHtml)
        w.Close()
    End Sub
End Class
2. So i tried the second solution (not really knowing what the output will be in that case, maybe more or
less the same), but after renaming the sub there was still the error: "Value of type 'System.Uri' cannot be
converted to 'string'"
-> Please change it to WebBrowser1.Url.ToString. I have modified my third post.
This solution will save the entire web page as an .mht file containing all text and images. It seems not to
be what you expect.
3. I just have one problem with the first task: when executing, the selection of the combo- and checkboxes
works perfectly fine, but the "aktualisieren" button is clicked endlessly. I'd like to stop that. (I used a
WebBrowser element from the toolbox in Form1)
-> CAUSE: Clicking the button to retrieve data refreshes and reloads the current page, so the
WebBrowser1_DocumentCompleted event fires again every time.
Solution: You can place that code in the Button1_Click event.
Code Block
Public Class Form1
    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        WebBrowser1.Dock = DockStyle.Fill
        Me.WindowState = FormWindowState.Maximized
        ' Part 1: Use WebBrowser control to load web page
        WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")
    End Sub

    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        MessageBox.Show("Complete loading webpage") ' Optional code
    End Sub

    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        ' Part 2: Automatically select specified option from ComboBox
        Dim theElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("select")
        For Each curElement As HtmlElement In theElementCollection
            Dim controlName As String = curElement.GetAttribute("name").ToString
            If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Step" Then
                curElement.SetAttribute("Value", 0)
            End If
        Next
        Dim theWElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("input")
        For Each curElement As HtmlElement In theWElementCollection
            Dim controlName As String = curElement.GetAttribute("name").ToString
            ' Part 3: Automatically check the CheckBox
            If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$CBx_CapitalMeasures" Then
                curElement.SetAttribute("Checked", True)
            ' Part 4: Automatically click the button
            ElseIf controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$IBtn_Refresh1" Then
                ' JavaScript has a click method that we need to invoke on the current button element.
                curElement.InvokeMember("click")
            End If
        Next
        Dim w As IO.StreamWriter = New IO.StreamWriter("C:\Table.htm")
        w.Write(WebBrowser1.Document.Body.InnerHtml)
        w.Close()
    End Sub
End Class
4. But I need something that I can easily import into a database, such as .txt (the cells separated by tabs and
lines) or .xls.
But if the exported file will be more than the pure table data (as I expect), the problem doesn't really
matter.
-> You need to retrieve the part of the HTML code (<Table>...</Table>) that contains the table data. Here are some
references:
1) http://www.developer.com/net/csharp/article.php/10918_2230091_2
2) See the Similar issue; you can use Regular Expressions to extract part of the HTML code.
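As a rough illustration of the regular-expression route, the sketch below pulls the first table out of an HTML string and then lists its cells. The HTML fragment, the module name and both patterns are assumptions made up for this example; in the program the text would come from WebBrowser1.DocumentText, and a page with nested tables would need a more careful approach.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module TableExtractSketch
    Sub Main()
        ' Hypothetical page fragment standing in for WebBrowser1.DocumentText.
        Dim html As String = "<div><TABLE border=0><TR><TH>Datum</TH><TH>Schluss</TH></TR>" &
                             "<TR><TD>01.12.</TD><TD>12,50</TD></TR></TABLE></div>"
        Dim options As RegexOptions = RegexOptions.IgnoreCase Or RegexOptions.Singleline

        ' Lazy match from the first <table ...> tag to the first </table> tag.
        Dim table As Match = Regex.Match(html, "<table\b[^>]*>.*?</table>", options)
        If Not table.Success Then
            Console.WriteLine("No table found")
            Return
        End If

        ' Inside the table, capture the text of every header or data cell.
        For Each cell As Match In Regex.Matches(table.Value, "<t[dh]\b[^>]*>(.*?)</t[dh]>", options)
            Console.WriteLine(cell.Groups(1).Value)
        Next
    End Sub
End Module
```

The captured cell texts could then be joined with tabs and written line by line to get the .txt layout asked for above.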
I'm glad to hear that you have made enormous progress. Cheers!
Best regards,
Martin
Hi Martin
I tried to use the Button1_Click event but an error occurred: "Handles clause requires a WithEvents variable
defined in the containing type or one of its base types"
Nevertheless, when executing it, the same endless clicking of the refresh button happened...
Thanks for your efforts!
Dominik
Wednesday, December 5, 2007 9:56 AM
d.j.t wrote:
Hi Martin
I tried to use the Button1_Click event but an error occurred: "Handles clause requires a
WithEvents variable defined in the containing type or one of its base types"
Please directly drag&drop a Button control named Button1 to your Form.
http://msdn2.microsoft.com/en-us/library/aty3352y(VS.80).aspx
WithEvents specifies that one or more declared member variables refer to an instance of a class that can raise events.
Then at the top of the code view (e.g. Form1.vb), the Button1 will display in the Object Browser
comboBox, and all events corresponding to the Button1 will display in the Event Browser comboBox.
Well, I could have known it had something to do with a button on the form... sorry :-/
But now I'm really confused... because now I have to click the button to perform the tasks.
And I'm not sure what you want to tell me with:
Well, is there a possibility to solve the problem of the repetition by adding something like the following (in plain
English) to the code you first recommended?
"and if value of the combobox is not equal to 0?"
d.j.t wrote:
Because you said an error occurred: "Handles clause requires a WithEvents variable defined in the
containing type or one of its base types". The error has something to do with WithEvents, so that link was only
an extra reference. You can ignore it.
Back to the topic: Please drag and drop a Button control named Button1 onto your Form.
In this case, you have to click the button to perform the tasks. That is indeed a restriction.
OK! Please adopt this idea: still use the WebBrowser1_DocumentCompleted event, but add a Boolean variable
as a switch, which ensures the tasks are performed only once.
Code Block
Public Class Form1
    Dim march As Boolean ' Set a switch
    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        WebBrowser1.Dock = DockStyle.Fill
        Me.WindowState = FormWindowState.Maximized
        ' Part 1: Use WebBrowser control to load web page
        WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")
    End Sub
End Class
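The switch idea can be tried in isolation. This is a console sketch only: OnDocumentCompleted is a hypothetical stand-in for the WebBrowser1_DocumentCompleted handler, and the counter stands in for the select, check and click steps. Only the variable name march is taken from the post.

```vbnet
Imports System

Module OneShotSwitchSketch
    Private march As Boolean = True      ' the switch: True until the tasks have run
    Private tasksPerformed As Integer = 0

    ' Stand-in for WebBrowser1_DocumentCompleted. In the real form, clicking the
    ' refresh button reloads the page and raises the event again.
    Sub OnDocumentCompleted()
        If march Then
            march = False                ' flip first, so the reload cannot repeat the tasks
            tasksPerformed += 1          ' stand-in for select option, check box, click button
        End If
    End Sub

    Sub Main()
        ' Simulate the event firing three times in a row.
        For i As Integer = 1 To 3
            OnDocumentCompleted()
        Next
        Console.WriteLine("Tasks performed " & tasksPerformed.ToString() & " time(s)")
    End Sub
End Module
```

However often the event fires, the guarded block runs once, which is what stops the endless clicking of the refresh button.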
Thank you! That's exactly what I was trying to do (but lack of experience prevented me from doing so)! First
task accomplished!
So there remains the second task of extracting the table... even though, after you helped me so much,
I'm a bit embarrassed to ask: did you see my questions concerning your links (regarding extraction)
(Tuesday, 10:32 PM)?
d.j.t wrote:
I'm just working on the extraction.
- the first link is related to C# ... can I just change the language?
- the similar issue seems to be exactly what I want, but there is no complete
code provided
- the regular expressions thing (I apologize for this noob question): what is
that?
dominik
Suggest posting this task to Regular Expressions forum for quicker and better responses.
Also point out the Table where you want to extract data as below:
Code Block
<TABLE cellSpacing=0 cellPadding=0 width="100%" border=0>
<TBODY>
<TR>
<TR>
<TH class=wp1-header>Datum</TH>
<TH class=wp1-header>Eröffnung</TH>
<TH class=wp1-header>Hoch</TH>
<TH class=wp1-header>Tief</TH>
<TH class=wp1-header>Schluss</TH>
<TH class=wp1-header>Volumen</TH></TR>
<TR>
<TR>
<TR>
<TR>
<TR>
<TR>
<TR>
<TR>
<TR>
<TR>
<TR>
<TR>
<TR>
</TR>
</TBODY>
</TABLE>
By the way, convert C# code to VB.NET code by means of this Code Translator tool.
Hi Martin!
Well, there is one last question (even though others might follow :-) that fits in this topic: How do I click the
"weiter" button at the bottom of the table? I tried to do it the same way as clicking "refresh":
Dim theWElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("input")
For Each curElement As HtmlElement In theWElementCollection
Dim controlName As String = curElement.GetAttribute("name").ToString
I tried to find the TagName and the attribute for the "weiter" link, but it didn't work with what I found: "a"
instead of "input" and "id" instead of "name"
</td>
<td align="right"><a id="ctl00_ctl00_ctl16_ctl00_WP1Quotes_ctl03_LBtn_More" class="wp1-more" hre
</tr>
Once more I hope you can provide help.
Thanks Dominik
Please check part 5: Automatically click Continue link. ("weiter" is translated to "Continue")
Code Block
Public Class Form1
    Dim march As Boolean ' Set a switch
            march = False ' Once the task is accomplished, change the switch to False.
        Else ' If march = False, the above tasks are not needed; go straight on to clicking the "Continue" link.
            ' Part 5: Automatically click Continue link
            Dim hrefElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("a")
            For Each curElement As HtmlElement In hrefElementCollection
                Dim controlName As String = curElement.GetAttribute("id").ToString
                If controlName = "ctl00_ctl00_ctl16_ctl00_WP1Quotes_ctl03_LBtn_More" Then
                    curElement.InvokeMember("Click")
                End If
            Next
        End If
    End Sub
End Class
Hi Martin,
thanks for the reference to the other forum, it was quite useful: somebody there could provide assistance!
Like this
I'd be fine if the x could be a variable, selected in a form when starting the program. But that should be
rather easy then.
Dominik
Hi, with that code (thanks for it) the repetition at the end is happening again. I introduced a second
switch and changed the final part to avoid this:
Else
If marchb = True Then
curElement.InvokeMember("Click")
End If
Next
End If
End If
End Class
Hi Martin,
- At the regex forum I was given a lot of help, but one problem remains: I inserted the extraction where I
had planned it, but it seems it happens too fast: the extracted table is the one displayed before refreshing. I
hoped pausing for a few seconds, or another switch set after the new table has completely loaded, would do the
trick, but my attempts have not been successful yet.
- And another little thing: up to now the extracted table is saved to a "fix-named" file. As this program will
run often, I'd like to have a changing date component and (for several pages a day) a counter in the
filename.
Hi, OK, now I am puzzled once more: I finally tried the exporting, but it exported the first table, the table
that is displayed before the selection from the comboboxes is done (but I need the table that is
displayed after the combobox selection). What's wrong? Please have a look at my complete code. Thank
you:
Imports System.IO
Imports System.Text.RegularExpressions
marchb = True
WebBrowser1.Dock = DockStyle.Fill
Me.WindowState = FormWindowState.Maximized
WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?
_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")
End Sub
If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Step"
Then
curElement.SetAttribute("Value", 0)
End If
Next
If controlName =
"ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Lines" Then
curElement.SetAttribute("Value", 100)
End If
Next
If controlName =
"ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$CBx_CapitalMeasures" Then
curElement.SetAttribute("Checked", True)
ElseIf controlName =
"ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$IBtn_Refresh1" Then
curElement.InvokeMember("click")
End If
Next
'part 5 export
'java skript
Next
march = False ' If accomplish the task, change the switch to False.
lastDate = Nothing
Using sw As StreamWriter =
File.CreateText("M:\Dominik\Handelsblattskript\Testfergebnisse\export.txt")
sw.WriteLine(String.Join(separator, row))
lastDateStr = row(0)
Next
End Using
lastDate = DateTime.Parse(lastDateStr)
End If
Else ' If march = False, don't need to perform above tasks, directly
click Continue link.
If controlName =
"ctl00_ctl00_ctl16_ctl00_WP1Quotes_ctl03_LBtn_More" Then
curElement.InvokeMember("Click")
End If
marchb = False
Next
End If
End If
End Sub
End Class
Hi d.j.t,
Welcome back!
I'm glad to hear that you got much help from the Regular Expressions forum.
"but it seems it happens to fast: the extracted table is the one displayed before refreshing."
-> 'Delay 2 seconds
System.Threading.Thread.Sleep(2000)
ExportTableData()
"And another little thing: up to now the extracted table is saved to a "fix-named" file. as this programm will
run often, i'd like to have a changing date component and (for several pages a day) a counter in the
filename."
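One way such a filename could be built is sketched below. The helper NextExportPath, the "export" prefix, the temp folder and the name layout are all assumptions for illustration and are not taken from the thread's code.

```vbnet
Imports System
Imports System.IO

Module ExportFileNameSketch
    ' Hypothetical helper: returns the first unused path of the form
    ' <prefix>_<yyyyMMdd>_<nnn>.txt inside the given folder.
    Function NextExportPath(ByVal folder As String, ByVal prefix As String, ByVal day As DateTime) As String
        Dim counter As Integer = 1
        Dim candidate As String = ""
        Do
            candidate = Path.Combine(folder, String.Format("{0}_{1:yyyyMMdd}_{2:000}.txt", prefix, day, counter))
            counter += 1
        Loop While File.Exists(candidate)
        Return candidate
    End Function

    Sub Main()
        ' Example call; the temp folder stands in for the real export directory.
        Console.WriteLine(NextExportPath(Path.GetTempPath(), "export", DateTime.Today))
    End Sub
End Module
```

Because the counter only advances past names that already exist on disk, several exports on the same day get consecutive numbers without the program having to remember anything between runs.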
This is complete code. The modified parts are marked in bold font.
Code Block
Imports System.IO
Imports System.Text.RegularExpressions
Public Class Form1
Dim lastDate As DateTime
Dim marchb As Boolean
Dim march As Boolean ' Set a switch
'Delay 2 seconds
System.Threading.Thread.Sleep(2000)
'Call sub to extract
ExportTableData()
Else ' If march = False, don't need to perform above tasks, directly click Continue link.
If marchb = True And lastDate = Today.AddDays(1) Then ' something like that - don't think that already works
'Delay 2 seconds
System.Threading.Thread.Sleep(2000)
'Call sub to extract again
ExportTableData()
End If
marchb = False
Next
End If
End If
End Sub
' To be continue...
Code Block
' Continue
' I put the extract function code in a custom method so that it can be called conveniently.
Public Sub ExportTableData()
    'part 5 export
    'java script
    Dim rows As New System.Collections.ObjectModel.Collection(Of String())()
    Dim pattern As String = "(?<=" + Regex.Escape("myl+='") + ")([^\\]+(?:\\t))+([^\\]+(?=\\r\\n'))"
    For Each m As Match In Regex.Matches(WebBrowser1.DocumentText, pattern)
        rows.Add(m.Value.Split(New String() {"\t"}, StringSplitOptions.None))
    Next
End Class
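To see what the pattern picks out, it can be run against a small string instead of WebBrowser1.DocumentText. The two sample lines below are invented for this sketch and only imitate JavaScript source of the form myl+='...'; the pattern and the Split call are the ones from the post.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module RowExtractSketch
    Sub Main()
        ' Hypothetical page source. VB string literals have no backslash escapes,
        ' so \t and \r\n here are literal two-character sequences, as they would
        ' appear inside the page's JavaScript.
        Dim source As String = "myl+='01.12. 17:35:00\t12,34\t12,50\r\n';" & vbCrLf &
                               "myl+='30.11. 17:35:00\t12,10\t12,40\r\n';"

        ' Text after myl+=' : one or more fields ending in a literal \t,
        ' then a last field that is followed by a literal \r\n'
        Dim pattern As String = "(?<=" + Regex.Escape("myl+='") + ")([^\\]+(?:\\t))+([^\\]+(?=\\r\\n'))"

        For Each m As Match In Regex.Matches(source, pattern)
            ' Split on the literal \t sequence and re-join with real tab characters.
            Dim fields() As String = m.Value.Split(New String() {"\t"}, StringSplitOptions.None)
            Console.WriteLine(String.Join(vbTab, fields))
        Next
    End Sub
End Module
```

Each match is one row of the quote table; the lookbehind and lookahead keep the surrounding myl+=' and \r\n' out of the matched value.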
Thanks for all those answers!!!! Just Great! i hope that with this i can finally finish my task! Loads of thanks!
Hi Martin,
finally I have complete working code doing exactly what I want. Big thanks to you! I still have some questions,
but they are mere "cosmetics".
- With that code the first table is copied twice. I don't really understand why...
- Can it easily be done that the user doesn't notice anything of the script's execution once it is
started? I mean no window, no sounds...
- I'd like the program to be used not only for one stock, but for several (up to 100). So I could just
change the address in the first sub and create an executable program for each stock, then write a few lines
that make all those programs run. I think this should even be possible at the same time(?).
Of course it would be more elegant if I didn't need to create so many single programs. Is there a
conveniently easy way to do this in the script?
Thanks! Dominik
PS: Script in next post... can't post it in color... (don't ask me why, the forum always refuses to accept it
(unknown error))
Imports System.IO
Imports System.Text.RegularExpressions
WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?
_p=200023&_t=wp1_quoteshistory&wp1_symbol=EAD.ETR")
curElement.SetAttribute("Value", 0)
End If
Next
End If
Next
curElement.InvokeMember("click")
march = False ' If accomplish the task, change the switch to False.
End If
Next
Else
If marchc = True And march = False Then ' If march = False, don't need to perform above tasks,
directly click Continue link.
End If
End If
If marchc = False And lastDate > Today.AddDays(-2) Then ' im not sure if that works
curElement.InvokeMember("Click")
End If
Next
extract()
'ElseIf lastDate > "01.01.0001" And lastDate < Today.AddDays(-2) Then : Close() 'just good to
know...
End If
Next
Next
End Class
Try this:
Code Snippet
Public Class Form1
    Dim document_completed As Integer
    Dim last_datetime As DateTime
    Dim earliest_datetime As DateTime

    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        WebBrowser1.Dock = DockStyle.Fill
        Me.WindowState = FormWindowState.Maximized
        Part1() ' Use WebBrowser control to load web page
    End Sub

    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        document_completed = document_completed + 1
        If document_completed = 1 Then ' First table
            Part2() ' Automatically select specified option from ComboBox
            Part3() ' Automatically check the CheckBox
            Part4() ' Automatically click the Button
        ElseIf document_completed > 1 And document_completed < 11 Then ' Second to tenth tables
            Part5() ' Extract javascript and update last_datetime
            If last_datetime > earliest_datetime Then
                Part6() ' Click Continue Button
            End If
        End If
    End Sub

    Private Sub Part1()
        ' Part 1: Use WebBrowser control to load web page
        document_completed = 0
        last_datetime = DateTime.Now
        earliest_datetime = last_datetime.AddDays(-2)
        WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")
    End Sub

    Private Sub Part2()
        ' Part 2: Automatically select specified option from ComboBox
    End Sub

    Private Sub Part3()
        ' Part 3: Automatically check the CheckBox
    End Sub

    Private Sub Part4()
        ' Part 4: Automatically click the Button
    End Sub

    Private Sub Part5()
        ' Part 5: Extract javascript and update last_datetime
    End Sub

    Private Sub Part6()
        ' Part 6: Click Continue Button
    End Sub
End Class
Edited by Tim Mathias Wednesday, October 14, 2009 6:25 PM Reformatted code snippet.
Code Snippet
If last_datetime > earliest_datetime Then
    Part6() ' Click Continue Button
Else
    Me.Close() ' Part 7: Close programme
End If
Edited by Tim Mathias Wednesday, October 14, 2009 6:10 PM Reformatted code snippet.
Hi Dominik,
I found a couple of bugs in Part 5 when I tried it out in C++ (I'm a C++ man, not a VB one). I've
highlighted the important changes in bold (namely: 24 hour clock, closing the output file
immediately after writing to it, and parsing a 15 character substring for the last datetime). (I've
also used GetElementById to get straight to the point.)
With the original version, ParseExact threw an exception every time, leaving the output file open
and empty. Maybe this is what is causing your stability issues with VB.
Code Snippet
void Part1 ()
{
    Trace::WriteLine ("Part 1");

    // Part 1: Use WebBrowser control to load web page
    document_completed = 0;
    last_datetime = DateTime::Now;
    earliest_datetime = last_datetime.AddDays (-2.0);
    webBrowser1->DocumentCompleted += gcnew WebBrowserDocumentCompletedEventHandler (this, &Form1::DocumentCompleted);
    webBrowser1->Navigate ("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX");
}

void Part2 ()
{
    Trace::WriteLine ("Part 2");

    // Part 2: Automatically select specified option from ComboBox
    HtmlElement ^el = webBrowser1->Document->GetElementById ("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_DD_Step");
    el->SetAttribute ("value", "0");
}

void Part3 ()
{
    Trace::WriteLine ("Part 3");

    // Part 3: Automatically check the CheckBox
    HtmlElement ^el = webBrowser1->Document->GetElementById ("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_CBx_CapitalMeasures");
    el->SetAttribute ("checked", "true");
}

void Part4 ()
{
    Trace::WriteLine ("Part 4");

    // Part 4: Automatically click the button
    HtmlElement ^el = webBrowser1->Document->GetElementById ("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_IBtn_Refresh1");
    el->InvokeMember ("click");
}

void Part5 ()
{
    Trace::WriteLine ("Part 5");

    // Part 5: Extract javascript and update last_datetime
    try
    {
        ArrayList ^rows = gcnew ArrayList ();
        Regex ^pattern = gcnew Regex ("(?<=myl\\+=\\')([^\\\\]+(?:\\\\t))+([^\\\\]+(?=\\\\r\\\\n'))");
        Trace::WriteLine ("Part 5: pattern = " + pattern);
        MatchCollection ^matches = pattern->Matches (webBrowser1->DocumentText);
        Trace::WriteLine ("Part 5: matches->Count = " + matches->Count);
        array <String^> ^tab = { gcnew String ("\\t") };
        for (int i = 0; i < matches->Count; i++)
        {
            Trace::WriteLine (matches [i]->Value);
            rows->Add (String::Join ("\t", matches [i]->Value->Split (tab, StringSplitOptions::None)));
            Trace::WriteLine (rows [i]);
        }
        String ^current_datetime = DateTime::Now.ToString ("yyyyMMddHHmmss"); // 24 hour clock
        StreamWriter ^file = gcnew StreamWriter ("BrowserAutomation" + current_datetime + ".txt");
        for (int i = 0; i < rows->Count; i++)
        {
            file->WriteLine (rows [i]);
        }
        file->Close ();

        String ^str_last_datetime = (String ^) rows [rows->Count - 1];
        Trace::WriteLine ("str_last_datetime = " + str_last_datetime);
        last_datetime = DateTime::ParseExact (str_last_datetime->Substring (0, 15), "dd.MM. HH:mm:ss", System::Globalization::CultureInfo::CreateSpecificCulture ("de-de"));
        Trace::WriteLine ("last_datetime = " + last_datetime);
    }
    catch (Exception ^e)
    {
        Trace::WriteLine ("Part 5: " + e->Message);
    }
}

void Part6 ()
{
    Trace::WriteLine ("Part 6");

    // Part 6: Click Continue Button
    HtmlElement ^el = webBrowser1->Document->GetElementById ("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_LBtn_More");
    el->InvokeMember ("click");
}
Edited by Tim Mathias Wednesday, October 14, 2009 6:20 PM Reformatted code snippet.
Hi
thanks for your posts, but as this is my first script and my programming experience is therefore near zero, I don't know how I would translate your script to VB.NET. Or do you propose changing to C++? Well, I've only used VB.NET up to now.
Nevertheless, I made some changes to my code (namely, I put AddDays(-1) everywhere I had different numbers before) and now it seems to work.
Well, this program is supposed to run on an old Win2000 SP4 computer that is not used for anything else, so nobody can interfere. But although everything was working fine on the (more or less new) Win XP computer on which I wrote the whole thing, it is not working that well on the old Win2000 SP4 computer.
What happens there (while it works fine most of the time) is that SOMETIMES the first table is copied, the one that was displayed when first browsing to the page, before doing the selections and refreshing. So to me it seems as if the script doesn't wait for the DocumentCompleted event any more. But only sometimes! Sometimes the correct table is also copied, sometimes not. I don't understand this! (Actually I never fully understood the DocumentCompleted event thing.) The only way I can explain it is that the old computer is too slow... I'm frustrated!
Thanks Dominik
d.j.t 20 Points
Imports System.IO
Imports System.Text.RegularExpressions
Me.Visible = False
marchc = True
WebBrowser1.Dock = DockStyle.Fill
Me.WindowState = FormWindowState.Maximized
WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=SAP.ETR")
End Sub
If controlName.contains("DD_Step") Then
curElement.SetAttribute("Value", 0)
End If
Next
If controlName.contains("DD_Lines") Then
curElement.SetAttribute("Value", 100)
End If
Next
If controlName.contains("CBx_CapitalMeasures") Then
curElement.SetAttribute("Checked", True)
curElement.InvokeMember("click")
End If
Next
Else
If marchc = True And march = False Then ' If march = False, no need to perform the above tasks; directly click the Continue link.
'part 5 export
extract()
marchc = False
End If
End If
If marchc = False And lastDate > Today.AddDays(-1) Then ' I'm not sure if that works
If controlName.Contains("LBtn_More") Then
curElement.InvokeMember("Click")
End If
Next
extract()
Me.Close()
End If
End Sub
'sub to extract
Next
lastDate = Nothing
sw.WriteLine(String.Join(separator, row))
lastDateStr = row(0)
Next
End Using
System.Threading.Thread.Sleep(2000)
End If
End Sub
End Class
d.j.t 20 Points
Dominik: "What happens there (while it works fine most of the time) is that SOMETIMES the first table is copied, the one that was displayed when first browsing to the page, before doing the selections and refreshing. So to me it seems as if the script doesn't wait for the DocumentCompleted event any more. But only sometimes! Sometimes the correct table is also copied, sometimes not. I don't understand this! (Actually I never fully understood the DocumentCompleted event thing.) The only way I can explain it is that the old computer is too slow... I'm frustrated!"
Hi Dominik,
In Part 6 you are extracting the javascript immediately after automatically clicking the More
button without waiting for the next webpage to load with new data:
Code Snippet
' Part 6: Automatically click Continue link
Dim hrefElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("a")
For Each curElement As HtmlElement In hrefElementCollection
    Dim controlName As String = curElement.GetAttribute("id").ToString
    If controlName.Contains("LBtn_More") Then
        curElement.InvokeMember("Click")
    End If
Next
extract() ' Runs immediately, before the next page has loaded
The code in my first post on this thread fixes that problem. The DocumentCompleted event fires when a new webpage loads. After clicking the button in Part 4 we have to wait for the next DocumentCompleted event, which tells us that the next webpage has loaded with new data. The same applies after clicking the More button in Part 6 (see: http://msdn2.microsoft.com/en-us/library/system.windows.forms.webbrowser.documentcompleted.aspx):
Code Snippet
Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, _
        ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) _
        Handles WebBrowser1.DocumentCompleted
    document_completed = document_completed + 1
    If document_completed = 1 Then ' First table
        Part2() ' Automatically select specified option from ComboBox
        Part3() ' Automatically check the CheckBox
        Part4() ' Automatically click the Button
    ElseIf document_completed > 1 And document_completed < 11 Then ' Second to tenth tables
        Part5() ' Extract javascript and update last_datetime
        If last_datetime > earliest_datetime Then
            Part6() ' Click Continue Button
        End If
    End If
End Sub
But the If statements need to be refined a bit because DocumentCompleted fires twice per page
(once for the page banner and once for the default page containing the javascript data that we
want):
Code Snippet
If (document_completed < 3) And (e.Url.AbsoluteUri = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX") Then
    .
    .
    .
ElseIf (document_completed > 2) And (e.Url.AbsoluteUri = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX") Then
The second problem is that you are using a 12 hour clock without specifying a.m. or p.m. when
generating the filename so there is potential for overwriting old files or appending new data to an
old file:
Code Snippet
' 12-hour clock ("hh"): ambiguous without a.m./p.m.
Dim currentDataTime As String = DateTime.Now.ToString("yyyyMMddhhmmss")
Code Snippet
' 24-hour clock ("HH"): use this instead
Dim currentDataTime As String = DateTime.Now.ToString("yyyyMMddHHmmss")
The other bugs I pointed out were "features" that I had introduced myself when converting from
VB to C++ (I was a bit unfamiliar with the Using statement) so you can ignore these.
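To see why the 12-hour format is risky for file names, here is a minimal console sketch. The module name, the helper Stamp and the two sample run times are assumptions made up for illustration; they are not part of the program discussed in this thread.

```vbnet
Imports System
Imports System.Globalization

Module ClockFormatDemo
    ' Builds the timestamp part of a file name with the given format string.
    Function Stamp(ByVal t As DateTime, ByVal format As String) As String
        Return t.ToString(format, CultureInfo.InvariantCulture)
    End Function

    Sub Main()
        ' Two hypothetical run times, twelve hours apart on the same day.
        Dim morning As New DateTime(2009, 10, 14, 5, 30, 0)
        Dim evening As New DateTime(2009, 10, 14, 17, 30, 0)

        ' "hh" is the 12-hour clock: both runs produce the same stamp.
        Console.WriteLine(Stamp(morning, "yyyyMMddhhmmss"))
        Console.WriteLine(Stamp(evening, "yyyyMMddhhmmss"))

        ' "HH" is the 24-hour clock: the two stamps differ.
        Console.WriteLine(Stamp(morning, "yyyyMMddHHmmss"))
        Console.WriteLine(Stamp(evening, "yyyyMMddHHmmss"))
    End Sub
End Module
```

With "hh", the 05:30 and 17:30 runs would both try to write the same file name, which is the overwrite risk described above.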
Edited by Tim Mathias Wednesday, October 14, 2009 6:03 PM Reformatted code snippets.
Hi Tim,
thanks for your comprehensive explanations! I think with the structure you are advising it should work a lot better than what I had before.
One thing I still don't understand is why my script extracts not only the "old table" but also the new one... well, but that doesn't matter.
First I wondered whether this would allow no more than 10 tables:
ElseIf document_completed > 1 And document_completed < 11 Then ' Second to tenth tables
But I see this part needs to be changed to what you wrote, so this restriction drops out:
ElseIf (document_completed > 2) And (e.Url.AbsoluteUri = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX") Then
Is it really necessary to test e.Url.AbsoluteUri = ..., given that the URL stays the same throughout the whole procedure?
Well, as I am doing this while studying I can't implement all your advice right now, but I'll do so soon and report my progress!
d.j.t 20 Points
Hi, I just tried it, works fine! Just the Me.Close part is missing, but no time left now; will continue next Friday.
Thanks a lot!!!!!! Dominik
> Is it exactly necessary to mention e.Url.AbsoluteUri = ... because the url stays the same
throughout the whole procedure?
It's essential because the URL DOESN'T stay the same throughout the whole procedure: the webpage contains a link to a banner page that also triggers the DocumentCompleted handler after it loads. I've added a MessageBox to show these two URLs. It's this double call that causes the first table to be extracted in your script (i.e. the table we want to ignore).
I've also added an If statement that returns when the banner URL completes (it's a bit neater than the former If tests I wrote).
Code Snippet
Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, _
        ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) _
        Handles WebBrowser1.DocumentCompleted
    MessageBox.Show("DocumentCompleted: " & e.Url.AbsoluteUri)
    If Not (e.Url.AbsoluteUri = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX") Then
        Return
    End If
    document_completed = document_completed + 1
    If document_completed = 1 Then ' First table
        Part2() ' Automatically select specified option from ComboBox
        Part3() ' Automatically check the CheckBox
        Part4() ' Automatically click the Button
    ElseIf document_completed > 1 Then
        Part5() ' Extract javascript and update last_datetime
        If last_datetime > earliest_datetime Then
            Part6() ' Automatically click Continue Button
        Else
            Me.Close() ' Part 7: Close programme
        End If
    End If
End Sub
Edited by Tim Mathias Wednesday, October 14, 2009 5:38 PM Reformatted code snippet.
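The effect of the URL test can be tried without a browser. The following console sketch is only an illustration: the module, the helper CountIfTarget, the example.com URLs and the order of the events are all assumptions, not taken from the real page. It feeds a made-up sequence of completed URLs through the same "ignore anything that is not the target page" check and shows that only the target page advances the counter.

```vbnet
Imports System

Module CompletionFilterDemo
    ' Counts only the completion events that belong to the target page.
    ' Events for any other URL leave the counter unchanged.
    Function CountIfTarget(ByVal counter As Integer, ByVal eventUrl As Uri, ByVal target As Uri) As Integer
        If eventUrl <> target Then
            Return counter ' For example the banner document: ignore it
        End If
        Return counter + 1
    End Function

    Sub Main()
        ' Hypothetical URLs standing in for the quotes page and its banner.
        Dim target As New Uri("http://example.com/quotes.aspx?symbol=ABC")
        Dim banner As New Uri("http://example.com/banner.aspx")

        ' Assumed event order: each page load completes two documents.
        Dim events() As Uri = {banner, target, banner, target}

        Dim counter As Integer = 0
        For Each url As Uri In events
            counter = CountIfTarget(counter, url, target)
            Console.WriteLine(url.AbsoluteUri & " -> counter = " & counter)
        Next
    End Sub
End Module
```

Four events arrive, but the counter only reaches 2, so the "first table" and "later tables" branches each run once per real page load.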
I'll test this script, but I think there is still one problem:
If the last date in the table is yesterday, the script will click "more/next table" ("weiter") to get the next table.
Now sometimes there is no further information [because the intraday data I need is saved for only 5 days or so]. Then when clicking on "more/next table" the same table is loaded again, as there is no next table. In that case the program will endlessly repeat the reloading and extraction of that table.
[With my data this is extremely unlikely to happen, but it happened for the first time in 2 weeks yesterday, so I got the same file a thousand times and the script (the former one) ran for about 12 hours until it crashed.]
What I thought of to solve this problem was to save the last date for one turn so that the next time we can compare whether the last date has changed. So we need the last date of the previous and the pre-previous table.
It can probably be done more easily. So don't continue reading if you have an easy solution.
document_completed = document_completed + 1
End Sub
But anyway, my idea was therefore to save the last date every second time into a new variable. To determine whether it is the second time, I wanted to count the document_completed events: as I understand it, we get this event 4 times within 2 turns.
So here is the code... I just didn't know how to determine whether a number is an integer...
...
ElseIf (document_completed > 2) And (e.Url.AbsoluteUri = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX") Then ' Second to xth tables
    Part5() ' Extract javascript and update last_datetime
    If lastdate > earliest_datetime And document_completed / 4 gives an integer and checkdate2 <> lastdate Then
        Part6() ' Click Continue Button
    ElseIf lastdate > earliest_datetime And document_completed / 4 does not give an integer and checkdate1 <> lastdate Then
        Part6() ' Click Continue Button
    Else
        Me.Close() ' Part 7: Close programme
    End If
End If
...
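On the question of how to test whether document_completed / 4 "gives an integer": in VB.NET this is normally written with the Mod operator, which returns the remainder of an integer division. A minimal sketch, in which the module and function names are made up for illustration:

```vbnet
Imports System

Module EveryFourthDemo
    ' True when count is evenly divisible by four (remainder zero).
    Function IsFourthEvent(ByVal count As Integer) As Boolean
        Return count Mod 4 = 0
    End Function

    Sub Main()
        ' Print the result for the first eight event counts.
        For count As Integer = 1 To 8
            Console.WriteLine(count & ": " & IsFourthEvent(count))
        Next
    End Sub
End Module
```

In the pseudocode above, "document_completed / 4 gives an integer" would then read document_completed Mod 4 = 0, and the "does not give an integer" branch would use <> 0.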
d.j.t 20 Points
I did originally limit the document_completed count to 10 tables to avoid an infinite repeat in case there was a problem parsing the DateTime from the webpage (the document_completed < 11 test). Without such a limit you'll have the cybercops after you for a suspected DoS attack.
Here's the ultimate bug-free code (until you find the next one):
Code Snippet
Dim previous_last_datetime As DateTime

Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, _
        ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) _
        Handles WebBrowser1.DocumentCompleted
    MessageBox.Show("DocumentCompleted: " & e.Url.AbsoluteUri)
    If Not (e.Url.AbsoluteUri = seite) Then
        Return
    End If
    document_completed = document_completed + 1
    If document_completed = 1 Then ' First table
        Part2() ' Automatically select specified option from ComboBox
        Part3() ' Automatically check the CheckBox
        Part4() ' Automatically click the Button
    ElseIf document_completed > 1 And document_completed < 11 Then
        ' Remember the oldest timestamp seen so far before reading the next table.
        ' This only works if last_datetime starts no earlier than the newest row (e.g. DateTime.Now).
        previous_last_datetime = last_datetime
        Part5() ' Extract javascript and update last_datetime
        If previous_last_datetime > last_datetime Then ' The new table reaches further back
            Part6() ' Automatically click Continue Button
        Else ' Same table again: no more data
            Me.Close() ' Part 7: Close programme
        End If
    End If
End Sub
Edited by Tim Mathias Wednesday, October 14, 2009 5:30 PM Reformatted code snippet.
I've had a deeper look at the website's pagination problem. I've separated the reading of the table rows from the writing of the table rows (Part5A and Part5B). I've also added a new variable, more_data, to test whether the next table is really more data or just a repeat of the last table. If you want, you can also add a time limit to this test (earliest_datetime), as we had before.
Currently (at the time of writing this post) there's still a mysterious problem with that particular website involving a double entry.
If you select 20 lines per page the latter of these entries disappears.
Code Snippet
Imports System.IO
Imports System.Text.RegularExpressions

Public Class Form1

    Dim seite As Uri
    Dim document_completed As Integer
    Dim last_datetime As DateTime
    Dim rows As ArrayList
    Dim more_data As Boolean

    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        Trace.WriteLine(vbCrLf & vbCrLf & "Form1_Load")
        Me.WindowState = FormWindowState.Maximized
        Part1() ' Use WebBrowser control to load web page
    End Sub

    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, _
            ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) _
            Handles WebBrowser1.DocumentCompleted
        Trace.WriteLine(vbCrLf & "WebBrowser1_DocumentCompleted url = " & e.Url.ToString)
        If (e.Url <> seite) Then
            Return ' Ignore banner page load
        End If
        document_completed = document_completed + 1
        Trace.WriteLine(vbCrLf & "document_completed = " & document_completed & vbCrLf)
        If document_completed = 1 Then ' First table
            Trace.WriteLine(vbCrLf & "Section A" & vbCrLf)
            Part2() ' Automatically select specified options from ComboBoxes
            Part3() ' Automatically check the CheckBox
            Part4() ' Automatically click the Button
        ElseIf more_data And document_completed < 11 Then
            Trace.WriteLine(vbCrLf & "Section B" & vbCrLf)
            Part5A() ' Read javascript table rows and update more_data
            If more_data Then
                Part6() ' Automatically click More Button
            Else
                Part5B() ' Write combined table rows to file
                Close() ' Part 7: Close programme
            End If
        Else
            Trace.WriteLine("Too many tables.")
            Part5B() ' Write combined table rows to file
            Close() ' Part 7: Close programme
        End If
    End Sub

    Private Sub Part1()
        ' Part 1: Use WebBrowser control to load web page
        Trace.WriteLine("Part1: Use WebBrowser control to load web page")
        seite = New Uri("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")
        document_completed = 0
        last_datetime = DateTime.Now
        rows = New ArrayList
        more_data = True
        WebBrowser1.Dock = DockStyle.Fill
        WebBrowser1.Navigate(seite)
    End Sub

    Private Sub Part2()
        ' Part 2: Automatically select specified options from ComboBoxes
        Trace.WriteLine("Part2: Automatically select specified options from ComboBoxes")
        Try
            ' Part 2A: Times & Sales
            Dim el1 As HtmlElement = WebBrowser1.Document.GetElementById("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_DD_Step")
            el1.SetAttribute("value", "0")

            ' Part 2B: 100 lines
            Dim el2 As HtmlElement = WebBrowser1.Document.GetElementById("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_DD_Lines")
            el2.SetAttribute("value", "100")
        Catch e As Exception
            Trace.WriteLine("ERROR: Part2: " & e.Message)
            Close()
        End Try
    End Sub

    Private Sub Part3()
        ' Part 3: Automatically check the CheckBox
        Trace.WriteLine("Part3: Automatically check the CheckBox")
        Try
            Dim el As HtmlElement = WebBrowser1.Document.GetElementById("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_CBx_CapitalMeasures")
            el.SetAttribute("checked", "true")
        Catch e As Exception
            Trace.WriteLine("ERROR: Part3: " & e.Message)
            Close()
        End Try
    End Sub

    Private Sub Part4()
        ' Part 4: Automatically click the Button
        Trace.WriteLine("Part4: Automatically click the Button")
        Try
            Dim el As HtmlElement = WebBrowser1.Document.GetElementById("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_IBtn_Refresh1")
            el.InvokeMember("click")
        Catch e As Exception
            Trace.WriteLine("ERROR: Part4: " & e.Message)
            Close()
        End Try
    End Sub

    Private Sub Part5A()
        ' Part 5A: Read javascript table rows and update more_data
        Trace.WriteLine("Part5A: Read javascript table rows and update more_data")
        Try
            Dim new_rows As New ArrayList
            Dim pattern As String = "(?<=" + Regex.Escape("myl+='") + ")([^\\]+(?:\\t))+([^\\]+(?=\\r\\n'))"
            Dim separator As String = vbTab
            For Each m As Match In Regex.Matches(WebBrowser1.DocumentText, pattern)
                ' Replace the literal backslash-t markers in the page source with real tabs
                new_rows.Add(String.Join(separator, m.Value.Split(New String() {"\t"}, StringSplitOptions.None)))
                Trace.WriteLine(new_rows(new_rows.Count - 1))
            Next
            ' The last row holds the oldest timestamp on this page
            Dim str_new_last_datetime As String = CStr(new_rows(new_rows.Count - 1))
            Dim new_last_datetime As DateTime
            new_last_datetime = DateTime.ParseExact(str_new_last_datetime.Substring(0, 15), "dd.MM. HH:mm:ss", System.Globalization.CultureInfo.CreateSpecificCulture("de-de"))
            If (new_last_datetime < last_datetime) Then
                Trace.WriteLine("Adding " & new_rows.Count & " new row(s) to combined rows.")
                rows.AddRange(new_rows)
                last_datetime = new_last_datetime
            Else
                ' Same table as before: stop paging
                Trace.WriteLine("Skipping new row(s).")
                more_data = False
            End If
        Catch e As Exception
            Trace.WriteLine("ERROR: Part5A: " & e.Message)
            Part5B() ' Save any accrued data
            Close()
        End Try
    End Sub

    Private Sub Part5B()
        ' Part 5B: Write combined table rows to file
        Trace.WriteLine("Part5B: Write combined table rows to file")
        If rows.Count > 0 Then
            Try
                Dim current_datetime As String = DateTime.Now.ToString("yyyyMMddHHmmss") ' 24 hour clock
                Trace.WriteLine("Writing " & rows.Count & " row(s) to file...")
                Using sw As StreamWriter = File.CreateText("BrowserAutomation" & current_datetime & ".txt")
                    For Each row As String In rows
                        sw.WriteLine(row)
                    Next
                End Using
                Trace.WriteLine("Done.")
            Catch e As Exception
                Trace.WriteLine("ERROR: Part5B: " & e.Message)
                Close()
            End Try
        Else
            Trace.WriteLine("No data to write.")
        End If
    End Sub

    Private Sub Part6()
        ' Part 6: Automatically click More Button
        Trace.WriteLine("Part 6: Automatically click More Button")
        System.Threading.Thread.Sleep(2000) ' Pause two seconds between requests
        Try
            Dim el As HtmlElement = WebBrowser1.Document.GetElementById("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_LBtn_More")
            el.InvokeMember("click")
        Catch e As Exception
            Trace.WriteLine("ERROR: Part6: " & e.Message)
            Part5B() ' Save any accrued data
            Close()
        End Try
    End Sub

End Class
Edited by Tim Mathias Wednesday, October 14, 2009 5:22 PM Reformatted code snippet.
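The parsing step in Part5A can be tried on its own, without the WebBrowser control. The sketch below reuses the same regular expression and the same ParseExact format, but everything else is an assumption for illustration: the module and function names are made up, and the sample page fragment is only a guess at what a myl+='...' line might look like, so the real page may differ.

```vbnet
Imports System
Imports System.Globalization
Imports System.Text.RegularExpressions

Module RowParseDemo
    ' Same pattern as Part5A: the text after myl+=' made of fields separated by
    ' a literal backslash-t, ending just before a literal \r\n'.
    Private ReadOnly RowPattern As String = _
        "(?<=" & Regex.Escape("myl+='") & ")([^\\]+(?:\\t))+([^\\]+(?=\\r\\n'))"

    ' Returns each matched row with the literal \t markers replaced by real tabs.
    Function ExtractRows(ByVal pageSource As String) As String()
        Dim matches As MatchCollection = Regex.Matches(pageSource, RowPattern)
        Dim result(matches.Count - 1) As String
        For i As Integer = 0 To matches.Count - 1
            result(i) = matches(i).Value.Replace("\t", vbTab)
        Next
        Return result
    End Function

    ' Parses the leading "dd.MM. HH:mm:ss" stamp. The row carries no year,
    ' so ParseExact fills in the current year.
    Function ParseStamp(ByVal row As String) As DateTime
        Return DateTime.ParseExact(row.Substring(0, 15), "dd.MM. HH:mm:ss", _
                                   CultureInfo.CreateSpecificCulture("de-DE"))
    End Function

    Sub Main()
        ' Hypothetical page fragment with two rows.
        Dim source As String = "myl+='14.10. 17:22:05\t12,34\t100\r\n';" & vbLf & _
                               "myl+='14.10. 17:21:40\t12,30\t250\r\n';"
        For Each row As String In ExtractRows(source)
            Dim stamp As DateTime = ParseStamp(row)
            Console.WriteLine(row & " | time: " & stamp.ToString("HH:mm:ss", CultureInfo.InvariantCulture))
        Next
    End Sub
End Module
```

Because the year is taken from the current date, a run that spans New Year could compare timestamps from different years incorrectly; for the few days of intraday data discussed here that is unlikely to matter, but it is worth knowing.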
Hi Tim,
thanks for both your posts! I implemented the first post and it did work.
The missing lines on the website: we probably can't do anything about that, but that shouldn't matter, I hope.
Now your second post looks really scary. The commands you use are totally different! I'd like to understand all that, but at the moment I just have no time, as I am studying and exams are held next week, and then I'll be away for a while.
But thanks anyway! Should what I have so far not work, I'll check it out!
Dominik
d.j.t 20 Points
Hello d.j.t,
Considering that many developers in this forum ask how to automate a web page via WebBrowser, rotate or flip images, my team has created a code sample for this frequently asked programming task in the Microsoft All-In-One Code Framework. You can download the code samples at:
VBWebBrowserAutomation
http://bit.ly/VBWebBrowserAutomation
CSWebBrowserAutomation
http://bit.ly/CSWebBrowserAutomation
With these code samples, we hope to reduce developers' efforts in solving the frequently asked programming tasks. If you have any feedback or suggestions for the code samples, please email us: onecode@microsoft.com.
------------
The Microsoft All-In-One Code Framework (http://1code.codeplex.com) is a free, centralized code sample library driven by developers' needs. Our goal is to provide typical code samples for all Microsoft development technologies, and reduce developers' efforts in solving typical programming tasks.
Our team listens to developers' pains in MSDN forums, social media and various developer communities. We write code samples based on developers' frequently asked programming tasks, and allow developers to download them with a short code sample publishing cycle. Additionally, our team offers a free code sample request service. This service is a proactive way for our developer community to obtain code samples for certain programming tasks directly from Microsoft.
Thanks