HTML Table > database entry
Hi.
Is there an easy way to "lift" data from HTML tables and enter that into
my database? I'm a total novice and so far my searches have yielded
little. I see Navicat has an import option, but that appears to be for
well structured data like Word, Excel or PDF...
Thanks,
Blago
Re: HTML Table > database entry
"Blagovist" <blag [at] ovist.com> wrote in message
news:462f0f0f_3 [at] x-privat.org...
> Hi.
> Is there an easy way to "lift" data from HTML tables and enter that into
> my database? I'm a total novice and so far my searches have yielded
> little. I see Navicat has an import option, but that appears to be for
> well structured data like Word, Excel or PDF...
>
> Thanks,
>
> Blago
If you've got Excel, then you can "bounce" a table via that (copy / paste)
then use that to import via Navicat....
D.
--
googlegroups > /dev/nul
Re: HTML Table > database entry
Virginner wrote:
> If you've got Excel, then you can "bounce" a table via that (copy / paste)
> then use that to import via Navicat....
>
> D.
I found something called easywebsave (an IE add-on) that looks
promising. But still a long way from being automated.
Blago
Re: HTML Table > database entry
Blagovist wrote:
> I found something called easywebsave (an IE add-on) that looks
> promising. But still a long way from being automated.
>
> Blago
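The following code relies heavily on your input HTML table being well-formed
XHTML (lowercase tags, every <tr> and <td> closed, no nested tables):
$text = "<table> [your table here] </table>";
/* First, strip everything up to the first <td> and everything from the
   last </td> onwards. The "s" modifier lets "." match newlines. */
if (!preg_match('/<tr[^>]*>.*?<td[^>]*>(.+)<\/td>.*?<\/tr>/s', $text, $match))
{
    exit('No table rows found.');
}
$to_split = $match[1];
/* Now split wherever a row is closed, then opened. Only whitespace is
   allowed between the tags, so cells in the middle of a row are not eaten. */
$rows = preg_split('/<\/td>\s*<\/tr>\s*<tr[^>]*>\s*<td[^>]*>/', $to_split);
$cells = array();
foreach ($rows as $row)
{
    // Now split each row into cells.
    $cells[] = preg_split('/<\/td>\s*<td[^>]*>/', $row);
}
Your data is now split into a two-dimensional array ($cells[row][column]).
Putting it into a database is pretty trivial after that.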
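For what it's worth, the database step could look roughly like the sketch below. It assumes PDO with the pdo_sqlite extension and uses an in-memory SQLite database so it runs on its own. The insert_rows() helper, the "fruit" table and its column names are made up for illustration; with MySQL only the DSN passed to new PDO() should need to change.

```php
<?php
// Hypothetical helper: inserts each row of a two-dimensional array
// through one prepared statement. Table and column names are NOT bound
// parameters, so they must come from your own code, never from the page.
function insert_rows(PDO $pdo, string $table, array $columns, array $cells): int
{
    $placeholders = implode(', ', array_fill(0, count($columns), '?'));
    $sql = sprintf(
        'INSERT INTO %s (%s) VALUES (%s)',
        $table,
        implode(', ', $columns),
        $placeholders
    );
    $stmt = $pdo->prepare($sql);

    $inserted = 0;
    foreach ($cells as $row) {
        // Skip rows whose cell count does not match the column list.
        if (count($row) !== count($columns)) {
            continue;
        }
        $stmt->execute(array_values(array_map('trim', $row)));
        $inserted++;
    }
    return $inserted;
}

// Stand-in data in the same shape as the $cells array (made-up values).
$cells = [
    ['Apples', '1.20'],
    ['Pears', '0.95'],
];

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE fruit (name TEXT, price TEXT)');

$count = insert_rows($pdo, 'fruit', ['name', 'price'], $cells);
echo "Inserted $count rows\n";
```

The prepared statement keeps cell text from being interpreted as SQL, which matters when the data comes from somebody else's web page.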
--
cb
Re: HTML Table > database entry
On 26 Apr, 21:50, Christoph Burschka <christoph.bursc... [at] rwth-
aachen.de> wrote:
> Your data is now split in a two-dimensional array. Putting it into a database is
> pretty trivial after that.
But what if the data has individual formatting? The data in one cell
could have a superscript or be in bold. All those tags would end up in
the cell values.
Re: HTML Table > database entry
"Blagovist" <blag [at] ovist.com> wrote in message
news:463097cb_1 [at] x-privat.org...
> I found something called easywebsave (an IE add-on) that looks promising.
> But still a long way from being automated.
Ah! You didn't say "automated" in your OP, hence my suggestion about
Excel -> Navicat.
If you want it automated, read the URL into a string with file_get_contents(),
use strip_tags() to remove everything except the table-related tags, then use
a few explode() or preg_split() calls to rip the remaining data into array(s).
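A rough sketch of those steps, assuming lowercase, properly closed <tr> and <td> tags and no nested tables. The table_to_array() name and the sample HTML are made up; in real use $html would come from file_get_contents() on your URL.

```php
<?php
// Hypothetical helper: turns the <tr>/<td> structure of an HTML string
// into a two-dimensional array of plain-text cell values.
function table_to_array(string $html): array
{
    // Keep only row and cell tags. Formatting tags such as <b> or <sup>
    // are dropped, but the text inside them is kept.
    $stripped = strip_tags($html, '<tr><td>');

    $rows = [];
    foreach (explode('</tr>', $stripped) as $chunk) {
        $start = strpos($chunk, '<tr');
        if ($start === false) {
            continue; // text outside any row, e.g. after the last </tr>
        }
        // Cut off anything before the row, then split it into cells.
        $cells = explode('</td>', substr($chunk, $start));
        array_pop($cells); // whatever follows the last </td>
        // Remove the leftover <tr>/<td> opening tags from each cell.
        $rows[] = array_map(fn($c) => trim(strip_tags($c)), $cells);
    }
    return $rows;
}

// Literal sample so the example runs offline (made-up data).
$html = '<html><body><p>Prices</p><table>'
      . '<tr><td><b>Apples</b></td><td>1.20</td></tr>'
      . '<tr><td>Pears<sup>*</sup></td><td>0.95</td></tr>'
      . '</table></body></html>';

$data = table_to_array($html);
// $data is [['Apples', '1.20'], ['Pears*', '0.95']]
print_r($data);
```

Header cells (<th>) are ignored here; add '<th>' to the allowed tags and split on '</th>' as well if you need them.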
D.
--
googlegroups > /dev/nul
Re: HTML Table > database entry
Captain Paralytic wrote:
> But what if the data has individual formatting? The data in one cell
> could have a superscript or be in bold. All those tags would end up in
> the cell values.
>
Hopefully, that information is in the style attribute of the cell tag (and will
get split away, since <td[^>]*> matches a complete tag with all attributes). But
if there's markup inside the cell, calling strip_tags() on each cell value will
remove it and keep the text.
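A small sketch of that clean-up, assuming $cells is a two-dimensional array of cell strings like the one built by the earlier code. The clean_cells() name and the sample values are made up for illustration.

```php
<?php
// Hypothetical helper: removes inline markup from every cell and trims
// the surrounding whitespace. The text inside the tags is kept.
function clean_cells(array $cells): array
{
    $clean = [];
    foreach ($cells as $row) {
        $clean[] = array_map(
            fn($cell) => trim(strip_tags($cell)),
            $row
        );
    }
    return $clean;
}

// Made-up sample with bold and superscript markup inside the cells.
$cells = [
    ['<b>Apples</b>', ' 1.20 '],
    ['Pears<sup>*</sup>', '0.95'],
];

$cells = clean_cells($cells);
// $cells is now [['Apples', '1.20'], ['Pears*', '0.95']]
print_r($cells);
```

Note that the superscript text itself ("*") survives; strip_tags() only removes the tags around it.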
--
cb