Devel::Peek -- hash "Quality" query

This is a multipart message in MIME format.
--===============1276713214==
Content-Type: multipart/alternative;
boundary="=_alternative 007511C0862576B0_="

This is a multipart message in MIME format.
--=_alternative 007511C0862576B0_=
Content-Type: text/plain; charset="US-ASCII"

Dear gurus,

>From the perldoc for Devel::Peek:

The "quality" of a hash is defined as the total number of comparisons
needed to access
every element once, relative to the expected number needed for a
random hash. The value
can go over 100%.

The total number of comparisons is equal to the sum of the squares of
the number of
entries in each bucket. For a random hash of "n" keys into "k"
buckets, the expected
value is:

n + n(n-1)/2k

I'm working on some code documentation and have gotten to this point. The
above is vaguely worded. Or maybe I should just say that it doesn't
explicitly state what I'm looking for... So, do I want a higher
percentage for hash "quality," or a lower percentage? That is, which
direction of change signals improvement?

TIA,

Deane Rothenmaier
Programmer/Analyst
Walgreens Corp.
224-542-5150

You cannot strengthen the weak by weakening the strong. -- Abraham Lincoln
--=_alternative 007511C0862576B0_=
Content-Type: text/html; charset="US-ASCII"


<br><font size=2 face="sans-serif">Dear gurus,</font>
<br>
<br><font size=2 face="sans-serif">From the perldoc for Devel::Peek:</font>
<br>
<br><font size=1 face="Courier New">    The "quality"
of a hash is defined as the total number of comparisons needed to access
</font>
<br><font size=1 face="Courier New">    every element once, relative
to the expected number needed for a random hash. The value </font>
<br><font size=1 face="Courier New">    can go over 100%.</font>
<br>
<br><font size=1 face="Courier New">    The total number of comparisons
is equal to the sum of the squares of the number of </font>
<br><font size=1 face="Courier New">    entries in each bucket.
For a random hash of "n" keys into "k" buckets, the
expected </font>
<br><font size=1 face="Courier New">    value is:</font>
<br>
<br><font size=1 face="Courier New">         
          n + n(n-1)/2k</font>
<br>
<br><font size=2 face="sans-serif">I'm working on some code documentation
and have gotten to this point. The above is vaguely worded. Or maybe I
should just say that it doesn't explicitly state what I'm looking for...
 So, do I want a higher percentage for hash "quality," or
a lower percentage?  That is, which direction of change signals improvement?</font>
<br>
<br><font size=2 face="sans-serif">TIA,</font>
<br>
<br><font size=2 face="sans-serif">Deane Rothenmaier<br>
Programmer/Analyst<br>
Walgreens Corp.<br>
224-542-5150<br>
<br>
You cannot strengthen the weak by weakening the strong. -- Abraham Lincoln</font>
--=_alternative 007511C0862576B0_=--


--===============1276713214==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
ActivePerl mailing list
ActivePerl [at] listserv.ActiveState.com
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
--===============1276713214==--
Deane.Rothenmaier [ Di, 19 Januar 2010 22:18 ] [ ID #2029228 ]

RE: Devel::Peek -- hash "Quality" query

This is a multipart message in MIME format.

--===============0117868236==
Content-Type: multipart/alternative;
boundary="----=_NextPart_000_0108_01CA9921.CE0EFE20"
Content-Language: en-us

This is a multipart message in MIME format.

------=_NextPart_000_0108_01CA9921.CE0EFE20
Content-Type: text/plain;
charset="us-ascii"
Content-Transfer-Encoding: 7bit

Lower values (closer to 100%) are better:



A random hash should distribute the keys equally to all buckets, so it would need the minimal number of comparisons. Real world
distributions will have some buckets fuller than others. Since the number of comparisons (for visiting each element exactly once) is
quadratic with the number of elements in a bucket (linear search), the additional comparisons for the longer chains will not be
offset by the savings of comparisons from shorter chains.



But don't worry about it too much, as you can't really do anything about it anyways. The only reason for Devel::Peek to print this
metric is to allow Perl5 Porters to play with different implementations of the hashing function.



Cheers,

-Jan



From: activeperl-bounces [at] listserv.ActiveState.com [mailto:activeperl-bounces [at] listserv.ActiveState.com] On Behalf Of
Deane.Rothenmaier [at] walgreens.com
Sent: Tuesday, January 19, 2010 1:19 PM
To: activeperl [at] listserv.ActiveState.com
Subject: Devel::Peek -- hash "Quality" query




Dear gurus,

>From the perldoc for Devel::Peek:

The "quality" of a hash is defined as the total number of comparisons needed to access
every element once, relative to the expected number needed for a random hash. The value
can go over 100%.

The total number of comparisons is equal to the sum of the squares of the number of
entries in each bucket. For a random hash of "n" keys into "k" buckets, the expected
value is:

n + n(n-1)/2k

I'm working on some code documentation and have gotten to this point. The above is vaguely worded. Or maybe I should just say that
it doesn't explicitly state what I'm looking for... So, do I want a higher percentage for hash "quality," or a lower percentage?
That is, which direction of change signals improvement?

TIA,

Deane Rothenmaier
Programmer/Analyst
Walgreens Corp.
224-542-5150

You cannot strengthen the weak by weakening the strong. -- Abraham Lincoln


------=_NextPart_000_0108_01CA9921.CE0EFE20
Content-Type: text/html;
charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

<html xmlns:v=3D"urn:schemas-microsoft-com:vml" =
xmlns:o=3D"urn:schemas-microsoft-com:office:office" =
xmlns:w=3D"urn:schemas-microsoft-com:office:word" =
xmlns:m=3D"http://schemas.microsoft.com/office/2004/12/omml" =
xmlns=3D"http://www.w3.org/TR/REC-html40"><head><meta =
http-equiv=3DContent-Type content=3D"text/html; =
charset=3Dus-ascii"><meta name=3DGenerator content=3D"Microsoft Word 14 =
(filtered medium)"><style><!--
/* Font Definitions */
[at] font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
[at] font-face
{font-family:Tahoma;
panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman","serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
span.EmailStyle17
{mso-style-type:personal-reply;
font-family:"Calibri","sans-serif";
color:#1F497D;}
..MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri","sans-serif";}
[at] page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext=3D"edit" spidmax=3D"1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext=3D"edit">
<o:idmap v:ext=3D"edit" data=3D"1" />
</o:shapelayout></xml><![endif]--></head><body lang=3DEN-US link=3Dblue =
vlink=3Dpurple><div class=3DWordSection1><p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif" ;color:#1F497=
D'>Lower values (closer to 100%) are better:<o:p></o:p></span></p><p =
class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif" ;color:#1F497=
D'><o:p> </o:p></span></p><p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif" ;color:#1F497=
D'>A random hash should distribute the keys equally to all buckets, so =
it would need the minimal number of comparisons.  Real world =
distributions will have some buckets fuller than others. Since the =
number of comparisons (for visiting each element exactly once) is =
quadratic with the number of elements in a bucket (linear search), the =
additional comparisons for the longer chains will not be offset by the =
savings of comparisons from shorter chains.<o:p></o:p></span></p><p =
class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif" ;color:#1F497=
D'><o:p> </o:p></span></p><p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif" ;color:#1F497=
D'>But don’t worry about it too much, as you can’t really do =
anything about it anyways.  The only reason for Devel::Peek to =
print this metric is to allow Perl5 Porters to play with different =
implementations of the hashing function.<o:p></o:p></span></p><p =
class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif" ;color:#1F497=
D'><o:p> </o:p></span></p><p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif" ;color:#1F497=
D'>Cheers,<o:p></o:p></span></p><p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif" ;color:#1F497=
D'>-Jan<o:p></o:p></span></p><p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif" ;color:#1F497=
D'><o:p> </o:p></span></p><div =
style=3D'border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in =
4.0pt'><div><div style=3D'border:none;border-top:solid #B5C4DF =
1.0pt;padding:3.0pt 0in 0in 0in'><p class=3DMsoNormal><b><span =
style=3D'font-size:10.0pt;font-family:"Tahoma","sans-serif"' >From:</span>=
</b><span style=3D'font-size:10.0pt;font-family:"Tahoma","sans-serif"'> =
activeperl-bounces [at] listserv.ActiveState.com =
[mailto:activeperl-bounces [at] listserv.ActiveState.com] <b>On Behalf Of =
</b>Deane.Rothenmaier [at] walgreens.com<br><b>Sent:</b> Tuesday, January 19, =
2010 1:19 PM<br><b>To:</b> =
activeperl [at] listserv.ActiveState.com<br><b>Subject:</b> Devel::Peek -- =
hash "Quality" query<o:p></o:p></span></p></div></div><p =
class=3DMsoNormal><o:p> </o:p></p><p class=3DMsoNormal><br><span =
style=3D'font-size:10.0pt;font-family:"Arial","sans-serif"'> Dear =
gurus,</span> <br><br><span =
style=3D'font-size:10.0pt;font-family:"Arial","sans-serif"'> From the =
perldoc for Devel::Peek:</span> <br><br><span =
style=3D'font-size:7.5pt;font-family:"Courier New"'>    The =
"quality" of a hash is defined as the total number of =
comparisons needed to access </span><br><span =
style=3D'font-size:7.5pt;font-family:"Courier New"'>    every =
element once, relative to the expected number needed for a random hash. =
The value </span><br><span style=3D'font-size:7.5pt;font-family:"Courier =
New"'>    can go over 100%.</span> <br><br><span =
style=3D'font-size:7.5pt;font-family:"Courier New"'>    The =
total number of comparisons is equal to the sum of the squares of the =
number of </span><br><span style=3D'font-size:7.5pt;font-family:"Courier =
New"'>    entries in each bucket. For a random hash of =
"n" keys into "k" buckets, the expected =
</span><br><span style=3D'font-size:7.5pt;font-family:"Courier =
New"'>    value is:</span> <br><br><span =
style=3D'font-size:7.5pt;font-family:"Courier New"'>      =
              n + n(n-1)/2k</span> =
<br><br><span =
style=3D'font-size:10.0pt;font-family:"Arial","sans-serif"'> I'm working =
on some code documentation and have gotten to this point. The above is =
vaguely worded. Or maybe I should just say that it doesn't explicitly =
state what I'm looking for...  So, do I want a higher percentage =
for hash "quality," or a lower percentage?  That is, =
which direction of change signals improvement?</span> <br><br><span =
style=3D'font-size:10.0pt;font-family:"Arial","sans-serif"'> TIA,</span> =
<br><br><span =
style=3D'font-size:10.0pt;font-family:"Arial","sans-serif"'> Deane =
Rothenmaier<br>Programmer/Analyst<br>Walgreens =
Corp.<br>224-542-5150<br><br>You cannot strengthen the weak by weakening =
the strong. -- Abraham =
Lincoln</span><o:p></o:p></p></div></div></body></html>
------=_NextPart_000_0108_01CA9921.CE0EFE20--


--===============0117868236==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
ActivePerl mailing list
ActivePerl [at] listserv.ActiveState.com
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
--===============0117868236==--
Jan Dubois [ Mi, 20 Januar 2010 01:09 ] [ ID #2029374 ]
Perl » gmane.comp.lang.perl.active-perl » Devel::Peek -- hash "Quality" query

Vorheriges Thema: mixing, disting cpan modules with standard ones
Nächstes Thema: fork or thread?