Habeas Data – Datos Personales – Privacidad

California adopta ley sobre RFID

Posted: octubre 19th, 2007 | Author: | Filed under: EEUU, Normas, RFID | Comentarios desactivados

El gobernador de California Arnold Schwarzenegger promulgó la ley de RFID (Senate Bill 362), que prohíbe a empleadores y otras personas la implantación de chips RFID en seres humanos. La ley entra en vigencia el 1° de enero de 2008 y sigue leyes similares aprobadas en los estados de Wisconsin y North Dakota.

Texto de la norma

BILL NUMBER: SB 362 CHAPTERED
BILL TEXT

CHAPTER 538
FILED WITH SECRETARY OF STATE OCTOBER 12, 2007
APPROVED BY GOVERNOR OCTOBER 12, 2007
PASSED THE SENATE AUGUST 30, 2007
PASSED THE ASSEMBLY AUGUST 27, 2007
AMENDED IN ASSEMBLY JUNE 27, 2007
AMENDED IN SENATE APRIL 24, 2007
AMENDED IN SENATE APRIL 9, 2007
AMENDED IN SENATE MARCH 26, 2007

INTRODUCED BY Senator Simitian

FEBRUARY 20, 2007

An act to add Section 52.7 to the Civil Code, relating to
identification devices.

LEGISLATIVE COUNSEL’S DIGEST

SB 362, Simitian. Identification devices: subcutaneous implanting.

Existing law accords every person the right of protection from
bodily restraint or harm, from personal insult, from defamation, and
from injury to his or her personal relations, subject to the
qualifications and restrictions provided by law.
This bill would prohibit a person from requiring, coercing, or
compelling any other individual to undergo the subcutaneous
implanting of an identification device, as defined. The bill would
provide for the assessment of civil penalties for a violation
thereof, as specified, and would allow an aggrieved party to bring an
action against a violator for damages and injunctive relief, subject
to a 3-year statute of limitation, or as otherwise provided.

THE PEOPLE OF THE STATE OF CALIFORNIA DO ENACT AS FOLLOWS:



Caso Bragg v. Linden Research Inc.

Posted: junio 13th, 2007 | Author: | Filed under: Casos, EEUU | Tags: | Comentarios desactivados

IN THE UNITED STATES DISTRICT COURT
FOR THE EASTERN DISTRICT OF PENNSYLVANIA
: CIVIL ACTION


Media sanción para la ley anti-spyware en Estados Unidos

Posted: junio 7th, 2007 | Author: | Filed under: EEUU, Proyecto de Ley, Spyware | No Comments »

Según una nota publicada ayer en "CNET":http://news.com.com/2100-1028_3-6189191.html, la norma fue aprobada por la Cámara de Representantes de Estados Unidos y ahora debería tratar el proyecto el Senado. Las empresas de tecnología no están de acuerdo. El "proyecto":http://news.com.com/2100-1028_3-6189191.html, tal como fue aprobado, fue criticado desde diversos sectores de Silicon Valley que "apoyan otro proyecto de ley":http://news.com.com/House+passes+more+tech-friendly+antispyware+bill/2100-7348_3-6185809.html. Entre otros argumentos, se esboza que el proyecto terminará regulando a todos los sitios que recolectan datos personales en internet, y no solo a aquellos que practican spyware. También señalan que el spyware ya es ilegal y que la FTC tiene facultades para actuar en estos casos. El proyecto "prohíbe instalar software espía sin permiso":http://news.com.com/2100-1028_3-6189191.html y recolectar datos personales en ausencia de consentimiento.

Si bien en Argentina, dado nuestro actual sistema legal, sería positivo tener una ley especial en esta materia, creemos que la ley 25.326 puede aplicarse a estas situaciones porque el software espía recoge datos personales sin permiso y los transfiere al exterior. Estas son claras violaciones a los arts. 5, 6 y 12 de la "ley 25.326":http://www.habeasdata.org/ley25326 y entendemos que habilitarían a cualquier titular del dato cuya máquina esté infectada con este programa a iniciar un habeas data para borrar la información personal recolectada en forma ilegal. Esta información no es anónima o estadística pues a partir de la IP se puede identificar al titular del dato.


¿Se deben notificar las fallas de seguridad informática?

Posted: febrero 5th, 2007 | Author: | Filed under: Blogs, Data Breach, Datos sensibles, EEUU, Robo de identidad | Tags: | Comentarios desactivados

¿Se deben notificar las fallas de seguridad informática?

Pablo Palazzi

Desde hace un tiempo se producen, y se publicitan abiertamente, fallas de seguridad a distintos niveles en los sistemas informáticos de empresas de todo tipo en Estados Unidos.

El listado actualizado de estos eventos (que empezó con el caso ChoicePoint) lo lleva la Privacy Rights Clearinghouse y la conclusión es que a la fecha ya se han robado o comprometido datos personales de más de 100 millones de norteamericanos (para ser más exactos: 101.070.850).

Esto ha provocado que a nivel estadual se sancionen leyes que exigen que toda empresa que sufra una falla de seguridad o pierda datos personales (en una laptop, pen drive, Blackberry, etc.) elabore una notificación y la dirija a los clientes cuya información haya sido potencialmente comprometida.



California aprueba ley sobre privacidad en dispositivos wireless

Posted: octubre 6th, 2006 | Author: | Filed under: EEUU, Panama, Público en general, RFID | Comentarios desactivados

Los fabricantes de dispositivos wifi deberán informar a los consumidores las medidas para proteger la privacidad de la información en comunicaciones wireless a partir del primero de octubre de 2007.



Google y el derecho a la privacidad sobre las búsquedas realizadas en Internet

Posted: septiembre 10th, 2006 | Author: | Filed under: Derecho a la imagen, EEUU, Google, Habeas Data, Jurisprudencia, Público en general | Comentarios desactivados

El 3 de abril del año 2006, la empresa Google entregó al Departamento de Justicia de los Estados Unidos los registros sobre búsquedas realizadas por usuarios en Internet. El pedido de facilitar estos registros fue inicialmente resistido por Google por diversos motivos que comentaremos en esta nota. Finalmente un juez federal en California admitió parcialmente el pedido del gobierno.

Este caso es el primero en abordar un tema singular y generalmente poco tratado: el derecho a la privacidad sobre el historial de búsquedas realizadas en Internet. Aunque no nos da una solución definitiva al problema y además la información que se entregó era anónima (pese a la confusión de muchos medios que informaron que el gobierno quería saber qué buscaba la gente online), el fallo señala ciertas pautas que servirán de guía en futuros casos. Lo que sigue es el comentario del fallo y nuestra opinión del mismo.

Introducción al caso
¿Por qué se requirió esta información a Google? El gobierno estadounidense necesitaba esta información para elaborar su defensa de la ley conocida como "Child Online Protection Act", cuya validez constitucional está siendo cuestionada por la ACLU en el caso Ashcroft v. ACLU. En ese juicio iniciado por la ACLU ante un tribunal federal de Pennsylvania, el Departamento de Justicia solicitó como prueba informativa (por medio de una "subpoena") que varias empresas (Yahoo!, Microsoft MSN, AOL y Google) elaboraran cada una un informe con los textos de cada búsqueda ingresada en sus herramientas de búsqueda por cada usuario y de cada sitio de Internet que el motor de búsqueda hubiera indexado. Con dicho pedido el gobierno pretende demostrar que una gran cantidad de búsquedas en la red están relacionadas con material pornográfico y que éste resulta muy difícil (o imposible) de filtrar o bloquear por medio de software (lo que en cierta forma justificaría la constitucionalidad de los medios dispuestos por el gobierno). Para ello el experto del gobierno en ese caso, Philip Stark, necesitaba una muestra bastante amplia de aquello que los usuarios buscan y encuentran frecuentemente en Internet.

A diferencia de los otros requeridos, el pedido a Google iba al core de su negocio: las búsquedas en Internet. Google se encontró frente al siguiente dilema: si sus usuarios saben que tarde o temprano se revelará lo que buscan en la red, probablemente dejen de usarlo con tanta frecuencia o directamente no lo usen. Pero su política de privacidad claramente lo habilitaba a dar estos datos. Por otra parte, Google debía diferenciarse de sus competidores, sobre todo luego del problema de privacidad que tuvo con las cuentas de Gmail. Todos los requeridos cumplieron con la petición, excepto Google, que se opuso cuestionando la falta de relevancia (o pertinencia) del pedido y la carga indebida que le provocaba la recopilación de la información solicitada.

La oposición de Google se fundó en varios motivos:
(i) el costo tecnológico de cumplir el pedido (poco creíble tratándose de Google…),
(ii) la pérdida de confianza que ocasionaría en sus usuarios si se revelaban los textos de búsqueda, lo que comprometía claramente la privacidad y el anonimato de dichos usuarios: imagínese qué pensará la gente si le andamos contando a todo el mundo qué buscó anoche en Internet…,
(iii) el secreto comercial del algoritmo del motor de búsqueda: es decir, no queremos que se sepa cómo funciona Google internamente.

A raíz de esta negativa, las partes entraron en negociaciones y el gobierno redujo voluntariamente su pedido a sólo una muestra de 50.000 sitios obtenidos mediante estas búsquedas y a todos los textos de las búsquedas realizadas por usuarios en el término de una semana. Al negarse Google nuevamente, al Estado no le quedó otra alternativa que iniciar la demanda judicial que dio origen al presente caso. En este caso el juez tuvo que resolver la oposición de Google. Sus argumentos los exploraremos en el punto siguiente.

¿Qué dijo el tribunal en el caso Gonzales v. Google?
El caso lleva ese nombre porque Alberto Gonzales actuó en su condición de attorney general de los Estados Unidos. El caso fue decidido el 17 de marzo de 2006 por un tribunal federal de primera instancia de California y, al no ser apelado por ninguna de las partes, el fallo quedó firme. La decisión deniega parcialmente el pedido del gobierno: por una parte autoriza la entrega de la muestra de sitios obtenidos mediante estas búsquedas, pero por la otra rechaza la de los textos de las búsquedas realizadas por usuarios, por considerar que duplicaba la prueba anterior de sitios encontrados. Además el juez consideró que el pedido del gobierno de esos textos de búsqueda planteaba cuestiones relacionadas con los secretos comerciales de los algoritmos de búsqueda de Google, la privacidad de los usuarios y la carga que implicaba para Google por la pérdida de confianza de sus usuarios.

Respecto a la relevancia, el tribunal criticó las explicaciones del gobierno y sus fundamentos pero, ante la duda, prefirió autorizar el pedido respecto de 50.000 URLs seleccionados al azar de la base de datos de Google para el estudio de la relevancia de los filtros. En cuanto a la posibilidad de que la medida constituyera una carga indebida por el costo de la entrega de la información, tal planteo fue rechazado habida cuenta de la oferta del gobierno de costear la búsqueda.

Respecto a los secretos comerciales de los algoritmos de búsqueda de Google, el tribunal consideró que el índice de búsquedas y el registro de búsquedas sí eran secretos comerciales, sobre todo si se trataba de una muestra significativa del resultado de las muestras del buscador. Pero dado que el gobierno había disminuido sus pretensiones, se consideró que no era probable que se afectara el secreto comercial de Google.

Respecto a la privacidad de los usuarios y la carga que implicaba para Google por la pérdida de confianza de éstos, el juez recordó que la política de privacidad de Google admitía expresamente la posibilidad de dar datos de búsqueda al gobierno, pero que si un cuarto de las búsquedas en la red se relaciona con pornografía, era de esperar cierta expectativa de privacidad por parte de los usuarios de Internet en esas búsquedas. Añadió que si bien tal derecho no era absoluto, sí indicaba una carga potencial para Google.

Por ende este aspecto, así como la potencial afectación a la privacidad, tuvo cierta gravitación en la decisión del juez de minimizar la información a otorgar al gobierno. La decisión discute estas cuestiones sin resolverlas al sostener: "El Gobierno plantea que su pedido del texto de búsquedas no genera problemas de privacidad porque el mero texto de las búsquedas no revela información identificatoria. Si bien el Gobierno sólo ha requerido los textos ingresados, puede hallarse información identificatoria básica en los casos en los que los usuarios buscan información personal, tal como su número de seguridad social o números de tarjetas de crédito, a través de Google a fin de determinar si tal información está accesible en Internet. El Tribunal también conoce la existencia de las así llamadas 'búsquedas vanidosas', en las que un usuario pregunta por su propio nombre, tal vez combinado con otro dato. La capacidad de Google para manejar grandes cadenas de búsqueda complejas puede llevar a los usuarios a embarcarse en tales búsquedas en Google. De tal modo, en tanto el texto de búsqueda de un usuario que dijera '(nombre del usuario) Stanford grupo de canto' no puede generar serias preocupaciones por la privacidad, la búsqueda de un usuario que dijera '(nombre del usuario) aborto tercer trimestre san jose' puede generar ciertas cuestiones de privacidad de las que todavía no se ocuparon los escritos de las partes. Esta preocupación, combinada con la preponderancia de las búsquedas de material de sexo explícito en Internet (una información que generalmente nadie desea revelar públicamente) plantea a este Tribunal una duda en cuanto a si los textos de búsqueda en sí mismos pueden constituir potencialmente información sensible".
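El riesgo que describe el tribunal puede ilustrarse con un bosquejo mínimo en Python. Los patrones y el nombre de la función son hipotéticos, solo a modo de ejemplo (no son una herramienta real del buscador ni del tribunal): detectan cuándo un texto de búsqueda parece incluir un dato identificatorio, como un número de seguridad social o de tarjeta.

```python
import re

# Patrones ilustrativos (supuestos, no exhaustivos) de datos identificatorios
# que un usuario podría escribir como texto de búsqueda.
PATRONES = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # nro. de seguridad social (EE.UU.)
    "tarjeta": re.compile(r"\b(?:\d[ -]?){15,16}\b"),   # posible nro. de tarjeta de crédito
}

def contiene_dato_sensible(consulta: str) -> bool:
    """Devuelve True si el texto de búsqueda parece incluir un dato identificatorio."""
    return any(p.search(consulta) for p in PATRONES.values())
```

Una búsqueda como "(nombre) Stanford grupo de canto" no dispara ningún patrón, pero una que incluya "123-45-6789" sí, exactamente el tipo de texto que, según el tribunal, el usuario ingresa para verificar si su dato está accesible en la red.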

Nos vamos a centrar en este último aspecto: el derecho a la privacidad sobre los datos que constituyen el historial de las búsquedas en Internet y que señalan qué es lo que una persona busca y encuentra en la red.

Comentarios: el futuro de la privacidad en Internet
Actualmente todos leemos el diario, consultamos nuestro correo electrónico, nos informamos, escribimos y nos comunicamos online. Muchas de esas actividades se realizan a través de portales únicos como Google o Yahoo! Allí también se completan formularios, se descargan programas y se registran todos estos movimientos a través de historiales, logs, cachés y cookies.

No me queda ninguna duda de que hoy día una persona no sólo se define por lo que piensa o expresa, sino también, y sobre todo para terceros, por lo que busca y encuentra en Internet. Pensemos que las búsquedas usando Google se han vuelto tan frecuentes que ya es común oír el verbo googlear para representar la acción de buscar un determinado contenido en Internet, pero solo a través de Google. Sus autores supieron transformar una compleja tesis doctoral en una de las compañías más brillantes de Internet. Es más, el verbo "to google" fue incluido recientemente en varios diccionarios (ej. Merriam-Webster y Oxford English Dictionary). Ello es demostrativo de cómo este buscador concentra la mayor parte de las búsquedas (no existe un verbo similar, por ejemplo, para Yahoo) pese a que existen otros buscadores y metabuscadores.
Pero volviendo al tema que nos ocupa, para sorpresa de muchos, todas estas actividades de búsqueda de datos online quedan guardadas en forma indefinida en la red.

Si bien la demanda del Departamento de Justicia no solicitaba datos personales de usuarios de la red, el hecho de que oficialmente se pidiera qué era lo que buscaban los usuarios de Internet a través de Google (aunque no se los identificara) generó cierta preocupación. Es que el listado de lo que se busca en la red necesariamente revelará algo de información personal de los usuarios que ingresaron esas búsquedas. Por otra parte, como Google retiene la dirección de IP de la conexión del usuario, cualquier proceso posterior donde esta información sea solicitada judicialmente (o por pedido de un abogado) permitirá conocer no solo qué buscó una persona en Internet sino también la identidad de esa persona. La dirección de IP es claramente un dato personal. A ello se suma el hecho de que en Estados Unidos todavía ningún precedente judicial ha determinado si legalmente los motores de búsqueda pueden ampararse en una ley aprobada en el año 1986, denominada Electronic Communications Privacy Act, que protege la privacidad de las comunicaciones electrónicas.
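Si la dirección de IP es un dato personal, una técnica frecuentemente propuesta para mitigar el riesgo es truncar el último octeto antes de conservar el registro, de modo que la búsqueda deje de ser atribuible a una conexión concreta. Un bosquejo mínimo en Python (el nombre de la función es inventado, a modo de ejemplo):

```python
def anonimizar_ip(ip: str) -> str:
    """Reemplaza el último octeto de una dirección IPv4 por cero.

    Bosquejo ilustrativo: reduce (no elimina) la posibilidad de
    vincular una búsqueda con una conexión concreta.
    """
    octetos = ip.split(".")
    if len(octetos) != 4 or not all(o.isdigit() for o in octetos):
        raise ValueError("se espera una IPv4 con cuatro octetos numéricos")
    octetos[-1] = "0"
    return ".".join(octetos)
```

Así, por ejemplo, "200.45.67.89" quedaría registrada como "200.45.67.0", y el log conservaría valor estadístico sin apuntar a un abonado individual.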

Pero el problema no es tanto un pedido del gobierno, pues si hay un delito que investigar y una orden judicial fundada para obtener esos datos, la medida es constitucional. El problema es que toda esta información estará allí disponible para que, si se la pide y obtiene judicialmente, se ventile en un juicio civil o penal.

El almacenamiento de datos personales sobre sitios visitados no es patrimonio exclusivo de Google. El sistema operativo y el navegador almacenan en forma automática y por defecto infinidad de datos tales como los sitios visitados, el sistema operativo utilizado, y la fecha y hora del acceso. Cada búsqueda que el usuario realiza en el navegador es guardada automáticamente pero, a diferencia de lo que ocurre en un buscador, el usuario siempre tiene la posibilidad de borrarla. Toda esta información puede ser utilizada en una investigación posterior y de hecho es cada vez más frecuentemente usada. El historial de búsquedas suele ser muy útil para cualquier investigación policial, como también lo son los bookmarks, porque demuestran claramente los sitios que el usuario tiene particular interés en volver a visitar.

En el caso de un juicio penal estos datos pueden tener una importancia extrema. Así ocurrió a fines de 2005 en el juicio contra Robert James Petrick. El imputado había buscado información a través de Google con términos tales como "rotura del cuello", cómo "deshacerse del cuerpo", "rigor mortis y descomposición" y el estado de las mareas (pues intentó esconder el cuerpo en un lago cercano). Las búsquedas cerraron las dudas que había en el caso. Es muy probable que en cada caso penal que se investigue se proceda a revisar el ordenador y sus registros de conexión a Internet.

Toda esta información va a ser buscada por investigadores: no sólo las búsquedas en Internet quedan almacenadas en Google, sino también las búsquedas realizadas en la barra de Google y dentro de la cuenta de Gmail, incluido también el correo electrónico allí recibido y borrado. En el caso "FTC v. Ameridebt, Inc.", un tribunal ordenó que Google entregara todos los documentos que tuviera en su poder, incluidos los correos de la cuenta que hubieran sido borrados por su usuario.

No es de extrañar entonces que con todos estos hechos se comenzara a plantear la regulación de los datos personales en estos ámbitos. Así, Wendy Seltzer, del "Berkman Center" de la Universidad de Harvard, planteó dos alternativas: o se obliga a los motores de búsqueda a guardar menos información, o se aprueba una ley federal que aumente la privacidad existente en materia de búsquedas en Internet.

En el Congreso estadounidense se presentó un proyecto de ley titulado "Eliminate Warehousing of Consumer Internet Data Act of 2006" que, siguiendo el estilo de la normativa europea en materia de protección de datos personales, obligaría al operador de un sitio de Internet a borrar la información de las visitas al sitio si la información ya no resulta necesaria para un fin legítimo del negocio. El proyecto de ley, sin embargo, fue criticado por McCullagh porque en realidad tendrá escaso efecto sobre el historial de búsquedas, dado que la definición de información personal no menciona a las direcciones de Internet ni a los términos de búsqueda.

Desde una visión más (económicamente) liberal, la organización NetCoalition, un grupo de lobby que representa a empresas de Internet, manifestó que este proyecto legislativo permitiría al gobierno, y no a los particulares, definir qué es un "fin legítimo del negocio". De esa forma se podría regular a los motores de búsqueda y obligarlos a retener ciertos datos.

No cabe ninguna duda de que este almacenamiento de datos personales será cada vez más amplio. Hace un tiempo, Google inauguró una opción dentro de su motor de búsqueda por la cual se permite al usuario personalizar la búsqueda (se llama Búsqueda Personalizada y sobre la misma Google presentó una patente). El usuario se identifica con un nombre y clave de acceso (que coincide con su cuenta de Gmail) y de esa forma Google puede mejorar la información de búsquedas en relación a un usuario particular. Google describe de la siguiente manera las ventajas de este servicio:
- Obtenga los resultados de búsqueda más relevantes para usted. La Búsqueda personalizada ordena los resultados de búsqueda en función de lo que usted buscó en el pasado. Al principio, es posible que no note un gran cambio en sus resultados de búsqueda, pero éstos mejorarán a medida que vaya usando Google.
- Vea y administre sus búsquedas del pasado. Navegue y busque en sus búsquedas del pasado, incluyendo páginas web, imágenes y titulares de noticias sobre los que haya hecho clic. Puede eliminar elementos de su Historial de búsquedas en cualquier momento.
- Cree marcadores a los que puede acceder desde cualquier sitio. Ponga marcadores en sus sitios web favoritos y añada etiquetas y notas a los mismos. Después podrá hacer búsquedas de sus etiquetas y notas, y podrá acceder a sus marcadores desde cualquier ordenador con simplemente registrarse.

La Política de privacidad que regula este servicio dispone que:
"La Búsqueda personalizada registra información acerca de su actividad en Google, incluyendo sus búsquedas, los resultados sobre los que hace clic, y la fecha y hora de sus búsquedas para mejorar los resultados de sus búsquedas y mostrar su historial de búsquedas. A lo largo del tiempo, el servicio podrá también usar información adicional sobre su actividad en Google u otra información que usted nos facilite a efectos de proporcionar mejores resultados de búsquedas… La Búsqueda personalizada usa la información antes descrita para mejorar sus resultados de búsquedas. Esta información será transmitida de modo seguro a los servidores de Google y guardada asociada a la información de su Cuenta de Google para proporcionarle el servicio".

Por supuesto, como es un servicio voluntario, el usuario en cualquier momento puede borrar todos estos registros. Sin embargo, más adelante la misma Política señala que:
"Usted puede borrar información de la Búsqueda personalizada utilizando el historial de búsquedas, y se eliminará del servicio. Sin embargo, como es una práctica habitual en el sector, y tal como se indica en la Política de Privacidad de Google, Google mantiene un sistema separado de registros a efectos de auditoría y para ayudarnos a mejorar la calidad de nuestros servicios a los usuarios".

La Política General de Privacidad de Google no aclara por cuánto tiempo se guardará esa información. Esto significa que será la propia empresa la que decidirá cuánto tiempo almacenar esta información, algo que sucede comúnmente con las empresas de Internet. Las leyes son reemplazadas por las políticas que determina y aplica cada empresa. Esto no es necesariamente negativo, pero nos sugiere que en Internet las políticas de privacidad y los estándares corporativos son una nueva fuente _de facto_ del Derecho.

Es importante entonces que las empresas que acumulan esta información piensen en términos de privacidad porque, en cierta forma, con sus acciones ellas son custodios de todos estos datos. Como han señalado en su reciente libro Bennett y Raab, hoy en día las políticas y actividades de compañías como IBM o Microsoft tienen mucho más impacto que las acciones de una determinada Nación: ambas pueden amenazar o proteger la privacidad en el contexto de sus transacciones comerciales. Esto es exactamente así: recordemos si no el número de serie en los procesadores Intel, los códigos similares en los archivos del programa MS Word o el actual debate sobre el sistema WGA de activación de Windows XP.

Pero más allá de que esta preocupación realmente exista a nivel corporativo, lo cierto es que el problema volverá a repetirse, como lo demuestra lo que ocurrió unos meses después del caso Google con la empresa America Online.

En julio de 2006, en una investigación sobre la conducta de los usuarios de Internet, AOL reveló datos personales de más de 658.000 suscriptores, que vieron expuestos públicamente sus términos de búsqueda en la red. Durante unos 10 días la firma "subió" a Internet datos personales sobre unos 19 millones de búsquedas hechas por suscriptores de AOL durante tres meses. La idea de esta publicación era suministrar datos a la comunidad científica para que tuviera material que le permitiera conocer en forma anónima el comportamiento de los usuarios de Internet.

Un sitio web resaltó el hecho de que muchos suscriptores habían hecho búsquedas usando sus nombres propios. Pero pese a tratarse de datos anónimos, como cada usuario era identificado numéricamente, en algunos casos era posible determinar, con el conjunto de las búsquedas, quién era el sujeto en cuestión. De hecho, el diario New York Times publicó el 9 de agosto una nota en la cual explicaba cómo a partir de estas búsquedas anónimas había llegado a individualizar a usuarios de AOL en un caso concreto.
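El mecanismo de reidentificación es sencillo de ilustrar: basta agrupar las consultas por el identificador numérico seudónimo para reconstruir un perfil. Bosquejo en Python con datos inventados (los identificadores y las consultas son ilustrativos, no provienen del volcado real de AOL):

```python
from collections import defaultdict

# Registros al estilo del volcado de AOL: (id_numerico, texto_de_busqueda).
registros = [
    (12345, "paisajistas en lilburn ga"),
    (12345, "personas de apellido arnold"),
    (12345, "casas vendidas en el condado de gwinnett"),
    (99999, "resultados de futbol"),
]

def perfil_por_usuario(regs):
    """Agrupa las búsquedas por seudónimo numérico; el conjunto resultante
    puede bastar para individualizar a la persona detrás del número."""
    perfiles = defaultdict(list)
    for uid, consulta in regs:
        perfiles[uid].append(consulta)
    return dict(perfiles)
```

Cada consulta aislada dice poco, pero el perfil agregado del usuario 12345 (localidad, apellido, operaciones inmobiliarias) acota rápidamente el universo de personas posibles: ese fue, en esencia, el método que describió el New York Times.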

Los efectos que genera la disponibilidad de información personal online son muy fuertes. Recientemente, un juez de North Carolina ordenó a Google que removiera contenidos personales que todavía estaban disponibles en el caché de Google, pese a que la administración pública por error había permitido que estuvieran allí. Se trataba de nombres, celulares, números de seguridad social, etc. Google respondió que tardaría cinco días y por eso se solicitó una restraining order, que fue concedida.

Para terminar…
A partir del presente caso, numerosos autores han planteado que las búsquedas deberían ser anónimas (ver Fred von Lohmann, Tim Wu, etc.). Se argumenta que no existe ninguna necesidad de identificar al usuario. Otros plantean destruir la información o limitar esta retención de datos (Geist).

También, obviamente, aparecieron empresas ofreciendo este servicio de búsqueda anónima, aunque algunas ya existían antes.

Pero esta acumulación tiene una justificación empresaria: Google explica que le permite mejorar sus servicios de las más diversas formas. Por ejemplo, ¿alguna vez se dio cuenta de que Google le reformula la búsqueda corrigiendo errores de ortografía comunes? ¿Cómo lo hicieron? Simple: al poder comparar los millones de búsquedas que luego son corregidas en forma inmediata por los propios usuarios, Google pudo implementar en su buscador la opción "Quizás quiso decir:" (o "Did you mean?") que corrige en forma automática la búsqueda con un error de ortografía… en cualquier idioma de los usados por Google. Sin embargo, no creo que para esto deban identificar a cada usuario y luego guardar esta información. Por eso me parece acertada la decisión de Google de resistir la entrega de datos al gobierno (es más, confío más en Google que en el Estado). Además, el mismo argumento no autorizaría al dueño de un restaurant a instalar micrófonos en cada mesa para saber si sus comensales están conformes con la calidad del servicio. He ahí la razón del planteo de anonimizar estos datos.
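La mecánica de "Quizás quiso decir" puede bosquejarse con un corrector mínimo basado en frecuencias tomadas de un registro de consultas. El registro es inventado y el algoritmo es una simplificación al estilo del conocido corrector ortográfico de Peter Norvig, no el algoritmo real de Google:

```python
from collections import Counter

# Mini "registro de consultas" hipotético: frecuencia de cada término buscado.
frecuencias = Counter({"privacidad": 120, "habeas": 40, "google": 300})

def ediciones1(palabra: str) -> set:
    """Variantes a distancia de edición 1 (borrados, reemplazos, inserciones, transposiciones)."""
    letras = "abcdefghijklmnopqrstuvwxyz"
    partes = [(palabra[:i], palabra[i:]) for i in range(len(palabra) + 1)]
    borrados = [a + b[1:] for a, b in partes if b]
    reemplazos = [a + c + b[1:] for a, b in partes if b for c in letras]
    inserciones = [a + c + b for a, b in partes for c in letras]
    transposiciones = [a + b[1] + b[0] + b[2:] for a, b in partes if len(b) > 1]
    return set(borrados + reemplazos + inserciones + transposiciones)

def quizas_quiso_decir(palabra: str) -> str:
    """Si la palabra no figura en el registro, propone la variante conocida más frecuente."""
    if palabra in frecuencias:
        return palabra
    candidatas = ediciones1(palabra) & set(frecuencias)
    return max(candidatas, key=frecuencias.__getitem__) if candidatas else palabra
```

Nótese que el principio funciona con la frecuencia agregada de los términos: no hace falta saber quién escribió cada consulta, que es justamente el argumento del texto a favor de anonimizar estos datos.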

Por supuesto que todo tiene una explicación técnica: los ordenadores necesitan de nuestros datos personales para poder funcionar en forma más eficiente. Pero puede suceder que algún día esta eficiencia y utilidad no superen en importancia, para un usuario, la necesidad de tener confidencialidad o privacidad sobre sus datos personales. A partir de ese momento probablemente el sistema deberá adaptarse a lo que quiera el usuario o el usuario cambiará de sistema. Pero el anonimato probablemente tiene un precio: la falta de comodidad.

Para finalizar, no podemos dejar de contrastar el debate sobre el acceso a las búsquedas, o sobre cuánto tiempo se guardarán éstas, con el problema de los datos de tráfico que tanta crítica y debate produjeron en Argentina y Europa. Recordemos que en aquel entonces se criticó el plazo de diez años para datos de tráfico establecido en la ley 25.873. Pues bien, en este caso parece ser que los datos de búsquedas en Internet pueden almacenarse en forma indefinida, conforme la política de privacidad de una empresa privada.

La información sobre las búsquedas en Internet parece no tener límites. Es decir, parece no haber un derecho al olvido en el mundo digital. La idea central del derecho a la protección de datos, esto es, intentar minimizar la recolección de información y otorgar al usuario un mayor control sobre sus datos personales, debería alcanzar su máxima expresión en Internet, donde todo parece quedar registrado en forma indefinida. Si aplicamos las normas de protección de datos personales (ley 25.326) a los motores de búsqueda en Internet probablemente nos encontraremos con una tarea impracticable por varios motivos: la diversidad de normas existentes, las dificultades de determinar el derecho aplicable (como ocurrió con Google Orkut en Brasil) e incluso el problema de aplicar un régimen que probablemente no tuvo en cuenta estas situaciones en forma específica. De allí que Spiros Simitis haya planteado recientemente la necesidad de crear normas específicas en cada nuevo sector.

Las grandes acumulaciones de datos personales que la tecnología ha creado han generado más fuentes de información que están disponibles para que terceros las soliciten, accedan y consulten por los más diversos motivos. ¿Es necesario entonces repensar los tradicionales derechos y garantías constitucionales en función del paradigma que plantea el cambio tecnológico? Nos parece que sí, que es una necesidad que se viene planteando desde hace varios años y que debe afinarse frente a estos nuevos desarrollos. Es necesario entonces reflexionar sobre cómo aplicar las leyes de protección de datos en Internet, y tal vez en este caso concreto comenzar a pensar en crear normas específicas que hagan referencia al anonimato. El caso en comentario es sólo el comienzo de una alerta sobre los necesarios resguardos que deberemos adoptar para preservar la privacidad en Internet.

****

Pablo Palazzi


Fallo final en el caso United States v. Google (privacidad sobre historial de búsqueda)

Posted: abril 17th, 2006 | Author: | Filed under: Casos, Derecho a la imagen, EEUU | No Comments »

ALBERTO R. GONZALES, in his official capacity as Attorney General of the United States, Plaintiff, v. GOOGLE, INC., Defendant.

NO. CV 06-8006MISC JW

UNITED STATES DISTRICT COURT FOR THE NORTHERN DISTRICT OF CALIFORNIA, SAN JOSE DIVISION

2006 U.S. Dist. LEXIS 13412

March 17, 2006, Decided

[*1] Attorney(s) for Plaintiff or Petitioner: Joel Mcelvain, U. S. Department of Justice.

Attorney(s) for Defendant or Respondent: Albert Gidari, Jr., Perkins Coie, LLP.

JAMES WARE, United States District Judge.

ORDER GRANTING IN PART AND DENYING IN PART MOTION TO COMPEL COMPLIANCE WITH SUBPOENA DUCES TECUM
I. INTRODUCTION
This case raises three vital interests: (1) the national interest in a judicial system to reach informed decisions through the power of a subpoena to compel a third party to produce relevant information; (2) the third-party’s interest in not being compelled by a subpoena to reveal confidential business information and devote resources to a distant litigation; and (3) the interest of individuals in freedom from general surveillance by the Government of their use of the Internet or other communications media.
In aid of the Government’s position in the case of ACLU v. Gonzales, Civil Action No. 98-CV-5591 pending in the Eastern District of Pennsylvania, United States Attorney General Alberto R. Gonzales has subpoenaed Google, Inc., (“Google”) to compile and produce a massive amount of information from Google’s search index, and to turn [*2] over a significant number of search queries entered by Google users. Google timely objected to the Government’s request. Following the requisite meet and confer, the Government filed the present Miscellaneous Action in this District to compel Google to comply with the subpoena. On March 14, 2006, this Court held a hearing on the Government’s Motion. n1 At that hearing, the Government made a significantly scaled-down request from the information it originally sought. For the reasons explained in this Order, the motion to compel, as modified, is GRANTED as to the sample of URLs from Google’s search index and DENIED as to the sample of users’ search queries from Google’s query log.

n1 The Court continued the hearing date originally proposed by the parties in order to allow for amici to prepare and submit their briefs to the Court.

II. PROCEDURAL BACKGROUND
In 1998, Congress enacted the Child Online Protection Act (“COPA”), which is now codified as 47 U.S.C. § 231. COPA prohibits the knowing [*3] making of a communication by means of the World Wide Web, “for commercial purposes that is available to any minor and that includes material that is harmful to minors,” subject to certain affirmative defenses. 47 U.S.C. § 231(a)(1). For this purpose, the statute defines the phrase “material that is harmful to minors” to mean material that is either obscene or material that meets each prong of a three-part test: “(A) the average person, applying contemporary community standards, would find, taking the material as a whole and with respect to minors, is designed to appeal to, or is designed to pander to, the prurient interest; (B) depicts, describes, or represents, in a manner patently offensive with respect to minors, an actual or simulated sexual act or sexual conduct, an actual or simulated normal or perverted sexual act, or a lewd exhibition of the genitals or post-pubescent female breast; and (C) taken as a whole, lacks serious literary, artistic, political, or scientific value for minors.” 47 U.S.C. § 231(e)(6).
Upon enactment of COPA, the American Civil Liberties Union and several other plaintiffs (“Plaintiffs”) filed an action in [*4] the Eastern District of Pennsylvania, challenging the constitutionality of the Act. The district court granted Plaintiffs’ motion for a preliminary injunction on the grounds that COPA is likely to be found unconstitutional on its face for violating the First Amendment rights of adults. ACLU v. Reno, 31 F. Supp. 2d 473 (E.D. Pa. 1998). The United States Court of Appeals for the Third Circuit affirmed the grant of the preliminary injunction. ACLU v. Reno, 217 F.3d 162 (3d Cir. 2000). After granting certiorari, the Supreme Court of the United States vacated the judgment of the Third Circuit, and remanded the case to that court for further review of the district court’s grant of preliminary injunction in favor of Plaintiffs. The Third Circuit again affirmed the preliminary injunction, ACLU v. Ashcroft, 322 F.3d 240 (3d Cir. 2003), and the Supreme Court again granted certiorari.
The Supreme Court affirmed the preliminary injunction and held that there was an insufficient record before it by which the Government could carry its burden to show that less restrictive alternatives may be more effective than the provisions of COPA. Ashcroft v. ACLU, 542 U.S. 656, 673 (2004). [*5] Of these alternatives directed at preventing minors from viewing “harmful to minors” material on the Internet, the Court focused on blocking and filtering software programs which “impose selective restrictions on speech at the receiving end, not universal restrictions at the source.” Id. at 667. To “allow the parties to update and supplement the factual record to reflect current technological realities,” the Court remanded the case for a trial on the merits. Id. at 672.
Following remand, Plaintiffs filed a First Amended Complaint (“FAC”). (98-CV-5591LR, E.D. Pa., Docket Item No. 175). Apparently, in preparing its defense, the Government initiated a study designed to somehow test the effectiveness of blocking and filtering software. To provide it with data for its study, the Government served a subpoena on Google, America Online, Inc. (“AOL”), Yahoo! Inc. (“Yahoo”), and Microsoft, Inc. (“Microsoft”). The subpoena required that these companies produce a designated listing of the URLs which would be available to a user of their services. The subpoena also required the companies to produce the text of users’ search queries. AOL, Yahoo, and Microsoft appear [*6] to be producing data pursuant to the Government’s request. Google, however, objected.
Google is a Delaware corporation headquartered in Mountain View, CA, that, like AOL, Yahoo, and Microsoft, also provides search engine capabilities. Based on the Government’s estimation, and uncontested by Google, Google’s search engine is the most widely used search engine in the world, with a market share of about 45%. The search engine at Google yields URLs in response to a search query entered by a user. The search queries entered may be of varying lengths, and incorporate a number of terms and connectors. Upon receiving a search query, Google produces a responsive list of URLs from its search index in a particular order based on algorithms proprietary to Google.
The initial subpoena to Google sought production of an electronic file containing two general categories. First, the subpoena requested “[a]ll URL’s that are available to be located to a query on your company’s search engine as of July 31, 2005.” (Decl. of Joel McElvain, Ex. A (“Subpoena”) at 4.) In negotiations with Google, this request was later narrowed to a “multi-stage random” sampling of one million URLs in Google’s indexed [*7] database. As represented to the Court at oral argument, the Government now seeks only 50,000 URLs from Google’s search index. Second, the Government also initially sought “[a]ll queries that have been entered on your company’s search engine between June 1, 2005 and July 31, 2005 inclusive.” (Subpoena at 4.) Following further negotiations with Google, the Government narrowed this request to all queries that have been entered on the Google search engine during a one-week period. During the course of the present Miscellaneous Action, the Government further restricted the scope of its request, and now represents that it only requires 5,000 entries from Google’s query log in order to meet its discovery needs.
Despite these modifications in the scope of the subpoena, Google maintained its objection to the Government’s requests. Before the Court is a motion to compel Google to comply with the modified subpoena, namely, for a sample of 50,000 URLs from Google’s search index and 5,000 search queries entered by Google’s users from Google’s query log.
III. STANDARDS
Rule 45 of the Federal Rules of Civil Procedure governs discovery of nonparties by [*8] subpoena. FED. R. CIV. P. 45 (“Rule 45”). The Advisory Committee Notes to the 1970 Amendment to Rule 45 state that the “scope of discovery through a subpoena is the same as that applicable to Rule 34 and other discovery rules.” Rule 45 advisory committee’s note (1970). Under Rule 34, the rule governing the production of documents between parties, the proper scope of discovery is as specified in Rule 26(b). FED. R. CIV. P. 34. See also Heat & Control, Inc. v. Hester Industries, Inc., 785 F.2d 1017 (Fed. Cir. 1986) (“Rule 45(b)(1) must be read in light of Rule 26(b)”); Exxon Shipping Co. v. U.S. Dept. of Interior, 34 F.3d 774, 779 (9th Cir. 1994) (applying both Rule 26 and Rule 45 standards to rule on a motion to quash subpoena).
Rule 26(b), in turn, permits the discovery of any non-privileged material “relevant to the claim or defense of any party,” where “relevant information need not be admissible at trial if the discovery appears reasonably calculated to lead to the discovery of admissible evidence.” Rule 26(b)(1). Relevancy, for the purposes of discovery, is defined [*9] broadly, although it is not without “ultimate and necessary boundaries.” Pacific Gas and Elec., Co. v. Lynch, No. C-01-3023 VRW, 2002 WL 32812098, at *1 (N.D. Cal. August 19, 2002) (citing Hickman v. Taylor, 329 U.S. 495, 507 (1947)).
Rule 26 also specifies that “[a]ll discovery is subject to the limitations imposed by Rule 26(b)(2)(i), (ii), and (iii)” which requires that discovery methods be limited where:

(i) the discovery sought is unreasonably cumulative or duplicative, or is obtainable from some source that is more convenient, less burdensome, or less expensive; (ii) the party seeking discovery has had ample opportunity by discovery in the action to obtain the information sought; or (iii) the burden or expense of the proposed discovery outweighs its likely benefit, taking into account the needs of the case, the amount in controversy, the parties’ resources, the importance of the issues at stake in the litigation, and the importance of the proposed discovery in resolving the issues.

The Advisory Committee Notes to the 1983 amendments to Rule 26 state that “[t]he objective is to guard against redundant or disproportionate [*10] discovery by giving the court authority to reduce the amount of discovery that may be directed to matters that are otherwise proper subjects of inquiry.” However, the commentators also caution that “the court must be careful not to deprive a party of discovery that is reasonably necessary to afford a fair opportunity to defend and prepare the case.” Rule 26 advisory committee’s note (1983).
In addition to the discovery standards under Rule 26 incorporated by Rule 45, Rule 45 itself provides that “on timely motion, the court by which a subpoena was issued shall quash or modify the subpoena if it . . . subjects a person to undue burden.” Rule 45(c)(3)(A). Of course, “if the sought-after documents are not relevant, nor calculated to lead to the discovery of admissible evidence, then any burden whatsoever imposed would be by definition ‘undue.’” Compaq Computer Corp. v. Packard Bell Elec., Inc., 163 F.R.D. 329, 335-36 (N.D. Cal. 1995). Underlying the protections of Rule 45 is the recognition that “the word ‘non-party’ serves as a constant reminder of the reasons for the limitations that characterize ‘third-party’ discovery.” Dart Indus. Co. v. Westwood Chem. Co., 649 F.2d 646, 649 (9th Cir. 1980) [*11] (citations omitted). Thus, a court determining the propriety of a subpoena balances the relevance of the discovery sought, the requesting party’s need, and the potential hardship to the party subject to the subpoena. Heat & Control, 785 F.2d at 1024.
IV. DISCUSSION
Google primarily argues that the information sought by the subpoena is not reasonably calculated to lead to evidence admissible in the underlying litigation, and that the production of information is unduly burdensome. The Court discusses each of these objections in turn, as well as the Court’s own concerns about the potential interests of Google’s users.

A. Relevance
Any information sought by means of a subpoena must be relevant to the claims and defenses in the underlying case. More precisely, the information sought must be “reasonably calculated to lead to admissible evidence.” Rule 26(b). This requirement is liberally construed to permit the discovery of information which ultimately may not be admissible at trial. Overbroad subpoenas seeking irrelevant information may be quashed or modified. See, e.g., Moon v. SCP Pool Corp., 232 F.R.D. 633, 637 (C.D. Cal. 2005) (quashing [*12] subpoena seeking the production of all purchasing information where the underlying contract dispute was limited to a particular geographic region); W.E. Green v. Baca, 219 F.R.D. 485, 490 (C.D. Cal. 2003) (providing a survey of cases where, in limiting the scope of a subpoena, district courts “effectively sustain[] an objection that the requests are vague, ambiguous, or overbroad in part, and overrules in part”).
This Court does not have the benefit of involvement with the underlying litigation. The Court adheres to the principle stated in Truswal Systems Corp. v. Hydro-Air Engineering, Inc., 813 F.2d 1207, 1211-12 (Fed. Cir. 1987): “A district court whose only connection with a case is supervision of discovery ancillary to an action in another district should be especially hesitant to pass judgment on what constitutes relevant evidence thereunder. Where relevance is in doubt . . . the court should be permissive.”
However, the Court does not construe a general policy of permissiveness to require this Court to abdicate its responsibility to review a subpoena under the Federal Rules when presented with a motion to compel. The Court has reviewed the [*13] decisions comprising the lengthy procedural history of this case in the Eastern District of Pennsylvania, the Third Circuit, and the Supreme Court, as well as Plaintiffs’ current complaint. The Court has heard the parties at oral argument n2 and proceeds to consider the merits of the Government’s motion.

n2 Counsel for Plaintiffs also appeared at the Court’s hearing on the Government’s Motion to Compel.

1. Sample of URLs
As narrowed by negotiations with Google and through the course of this Miscellaneous Action, the Government now seeks a sample of 50,000 URLs from Google’s search index. In determining whether the information sought is reasonably calculated to lead to admissible evidence, the party seeking the information must first provide the Court with its plans for the requested information. See Northwestern Memorial v. Ashcroft, 362 F.3d 923, 931 (7th Cir. 2004). The Government’s disclosure of its plans for the sample of URLs is incomplete. The actual methodology disclosed in the [*14] Government’s papers as to the search index sample is, in its entirety, as follows: “A human being will browse a random sample of 5,000-10,000 URLs from Google’s index and categorize those sites by content” (Supp. Decl. of Phillip B. Stark, Ph.D. (“Supp. Stark Decl.”) ¶ 4) and from this information, the Government intends to “estimate . . . the aggregate properties of the websites that search engines have indexed.” (Government’s Reply Memorandum in Support of the Motion to Compel Compliance with Subpoena Duces Tecum (“Reply”), Docket Item No. 21 at 4:8-9.) The Government’s disclosure only describes its methodology for a study to categorize the URLs in Google’s search index, and does not disclose a study regarding the effectiveness of filtering software. Absent any explanation of how the “aggregate properties” of material on the Internet are germane to the underlying litigation, n3 the Government’s disclosure as to its planned categorization study is not particularly helpful in determining whether the sample of Google’s search index sought is reasonably calculated to lead to admissible evidence in the underlying litigation.

n3 Whether adult material exists on the Internet could not seriously be contested by Plaintiffs with web content describing the slang terms “teabagging” and “pearl necklace” in graphic detail (FAC at 43), or websites which contain “numerous photographs of nude men and women in sexual poses with one another, and erotic stories that include graphic sexual scenes” (FAC at 34). Such a reading of the Complaint is also supported by the narrow question posed by the Supreme Court to be answered on remand for trial on the merits.

[*15]
Based on the Government’s statement that this information is to act as a “test set for the study” (Reply at 3:20) and a general statement that the purpose of the study is to “evaluate the effectiveness of content filtering software,” (Reply at 3:2-5) the Court is able to envision a study whereby a sample of 50,000 URLs from the Google search index may be reasonably calculated to lead to admissible evidence on measuring the effectiveness of filtering software. In such a study, the Court imagines, the URLs would be categorized, run through the filtering software, and the effectiveness of the filtering software ascertained as to the various categories of URLs. The Government does not even provide this rudimentary level of general detail as to what it intends to do with the sample of URLs to evaluate the effectiveness of filtering software, and at the hearing neither confirmed nor denied the Court’s speculations about the study. n4 In fact, the Government seems to indicate that such a study is not what it has in mind: “[t]he government seeks this information only to perform a study, in the aggregate, of trends on the Internet” (Reply at 1:19-20) (emphasis added), with no explanation [*16] of how an aggregate study of Internet trends would be reasonably calculated to lead to admissible evidence in the underlying suit where the efficacy of filtering software is at issue.
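The study the Court envisions (categorize sampled URLs, run them through the filter, measure effectiveness per category) can be sketched in outline. The code below is an illustrative reconstruction only: the blocklist-based `filter_blocks` function, the keyword `categorize` function, and the example hosts are all hypothetical stand-ins, not anything in the record; real filtering software and human categorization would take their place.

```python
from collections import defaultdict

# Hypothetical stand-in for commercial filtering software: a filter that
# blocks any URL whose host appears on a fixed blocklist.
BLOCKLIST = {"adult-example.test"}

def filter_blocks(url: str) -> bool:
    host = url.split("/")[2]  # "http://host/path" -> "host"
    return host in BLOCKLIST

def categorize(url: str) -> str:
    # Stand-in for the human categorization step described in the Order.
    return "adult" if "adult" in url else "other"

def filter_effectiveness(url_sample):
    """For a sample of URLs: categorize each one, run it through the
    filter, and report the filter's block rate per category."""
    blocked, total = defaultdict(int), defaultdict(int)
    for url in url_sample:
        cat = categorize(url)
        total[cat] += 1
        if filter_blocks(url):
            blocked[cat] += 1
    return {cat: blocked[cat] / total[cat] for cat in total}
```

On a toy sample such as `["http://adult-example.test/a", "http://news-example.test/b"]`, this sketch reports a block rate per category; the Court's point is that the Government never supplied even this rudimentary level of design detail.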

n4 The lack of disclosure on the part of the Government is particularly striking when seen in the context of the time that the Government has had to prepare this issue. The Supreme Court’s directive to the Government to address the effectiveness of filtering software was issued in 2004. Additionally, this is not a case where the Government does not have the benefit of any information with which to form some basic methodology — the Government has already been to the pond and fished, so to speak, with data from AOL, Yahoo, and Microsoft, and it would not have been unreasonable at this stage to have required the Government to assist the Court in its determination of relevance by providing the Court with more information on its plans for the information sought from Google.

As the court in Northwestern Memorial colorfully noted, [*17] “and of course, pretrial discovery is a fishing expedition and one can’t know what one has caught until one fishes [b]ut Fed.R.Civ.P. 45(c) allows the fish to object, and when they do so the fisherman has to come up with more,” 362 F.3d at 931 — it is difficult for a court to determine the relevance of information where the party seeking the information does not concretely disclose its plans for the information sought. Given the broad definition of relevance in Rule 26, and the current narrow scope of the subpoena, despite the vagueness with which the Government has disclosed its study, the Court gives the Government the benefit of the doubt. The Court finds that 50,000 URLs randomly selected from Google’s database for use in a scientific study of the effectiveness of filters is relevant to the issues in the case of ACLU v. Gonzales. n5

n5 To the extent that the Government is gathering this information for some other purpose than to run the sample of Google’s search index through various filters to determine the efficacy of those filters, the Court would take a different view of the relevance of the information. For example, the Court would not find the information relevant if it is being sought just to characterize the nature of the URLs in Google’s database.

[*18]
2. Search Queries
In its original subpoena the Government sought a listing of the text of all search queries entered by Google users over a two-month period. As defined in the Government’s subpoena, “queries” include only the text of the search string entered by a user, and not “any additional information that may be associated with such a text string that would identify the person who entered the text string into the search engine, or the computer from which the text string was entered.” (Subpoena at 4.) The Government has narrowed its request so that it now seeks only a sample of 5,000 such queries from Google’s query log. The Government discloses its plans for the query log information as follows: “A random sample of approximately 1,000 Google queries from a one-week period will be run through the Google search engine. A human being will browse the top URLs returned by each search and categorize the sites by content.” (Supp. Stark Decl. ¶ 3.) To the extent that the URLs obtained by the researchers as a result of running the search queries provided are then used to create “a sample of a relevant population of websites that can be categorized and used to test filtering software” [*19] (Reply at 5) similar to the sample created from URLs from Google’s search index, the Court finds that were the Government to run these URLs through the filtering software and analyze the results, the information sought would be reasonably calculated to lead to admissible evidence.
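The two-step procedure quoted from the Stark declaration (run each sampled query, then have a human categorize the top results) can be outlined as follows. The toy query-to-results mapping and the keyword categorizer are invented for illustration; they stand in for the live search engine and the human reviewer, respectively.

```python
# Hypothetical stand-in for a search engine: a fixed mapping from a
# query string to a ranked list of result URLs.
TOY_ENGINE = {
    "evening news": ["http://news-example.test/1", "http://news-example.test/2"],
    "adult videos": ["http://adult-example.test/1", "http://news-example.test/3"],
}

def categorize(url: str) -> str:
    # Stand-in for the human reviewer who categorizes sites by content.
    return "adult" if "adult" in url else "other"

def build_test_set(queries, top_n=2):
    """Run each sampled query, keep the top-N URLs it returns, and
    categorize each URL; the result is a labeled set of websites that
    could later be run through filtering software."""
    labeled = {}
    for query in queries:
        for url in TOY_ENGINE.get(query, [])[:top_n]:
            labeled[url] = categorize(url)
    return labeled
```

The design mirrors the declaration's sequence: the query sample determines which slice of the web enters the test set, which is why the parties dispute whether queries add anything beyond a URL sample drawn directly from the index.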
Google’s arguments challenging the relevance of the search queries to the Government’s study center around its contention that a number of additional factors exist which may mitigate the correlation between a search query and the search result. (Google’s Opposition to the Government’s Motion to Compel (“Opp.”), Docket Item No. 12 at 6:9-8:1.) In particular, Google cites to the presence of a safe search filter, customized searches, or advanced preferences all potentially activated at the user end and not reflected in the user’s search string. (Opp. at 6:17-7:2.) Google also argues that the list of search queries does not distinguish between sources of the queries such as adults, minors, automatic queries generated by a program, known as “bot” queries, and artificial queries generated by individual users. (Opp. at 7:3-22.) Contrary to Google’s belief, the broad standard of relevance under Rule 26 does not [*20] require that the information sought necessarily be directed at the ultimate fact in issue, only that the information sought be reasonably calculated to lead to admissible evidence in the underlying litigation. See Laxalt v. McClatchy, 809 F.2d 885, 888 (D.C. Cir. 1987) (holding that “mere relevance to the underlying litigation” is the proper standard to apply to discovery of certain FBI files). Thus, the presence of these additional factors may impact the probative value of the Government’s expert report in the Eastern District of Pennsylvania on the effectiveness of filtering software in preventing minors from accessing “harmful to minors” material on the Internet, but at this stage, the Court does not find the search queries to be entirely irrelevant to the creation of a test set on which to test the effectiveness of search filters in general.

B. Undue Burden
This Court is particularly concerned anytime enforcement of a subpoena imposes an economic burden on a non-party. Under Rule 45(c)(3)(A), a court may modify or quash a subpoena even for relevant information if it finds that there is an undue burden on the non-party. Undue burden to the non-party is [*21] evaluated under both Rule 26 and Rule 45. See Exxon Shipping Co. v. U.S. Dept. of Interior, 34 F.3d 774, 779 (9th Cir. 1994).
1. Technological Burden of Production
Google argues that it faces an undue burden because it does not maintain search query or URL information in the ordinary course of business in the format requested by the Government. (Opp. at 16:22-15.) As a general rule, non-parties are not required to create documents that do not exist, simply for the purposes of discovery. Insituform Tech., Inc. v. Cat Contracting, Inc., 168 F.R.D. 630, 633 (N.D. Ill. 1996). In this case, however, Google has not represented that it is unable to extract the information requested from its existing systems. Google contends that it must create new code to format and extract query and URL data from many computer banks, in total requiring up to eight full time days of engineering time. Because the Government has agreed to compensate Google for the reasonable costs of production, and given the extremely scaled-down scope of the subpoena as modified, the Court does not find that the technical burden of production excuses Google from complying with the [*22] subpoena. Later in this Order, the Court addresses other concerns with respect to this information, however.
Google also argues that even if the Government compensates Google for its engineering time, if the Government plans on executing a high volume of searches on Google, such searches would lead to an interference with Google’s search engine and disrupt use by users and advertisers. (Opp. at 16:24-17:3.) The Government only intends to run 1,000 to 5,000 of the search queries through the Google search engine. (Supp. Stark Decl. ¶ 4.) Furthermore, these searches will be run by humans who will then categorize the search results and record their findings. (Supp. Stark Decl. ¶ 4.) Given the volume and rate of the proposed study, the Court finds that the additional burden on Google’s search engine caused by the Government’s study, as represented to the Court, is likely to be de minimis.
2. Potential for Loss of User Trust
Google also argues that it will be unduly burdened by loss of user trust if forced to produce its users’ queries to the Government. Google claims that its success is attributed in large part to the volume of its users and these users may be attracted to its search [*23] engine because of the privacy and anonymity of the service. According to Google, even a perception that Google is acquiescing to the Government’s demands to release its query log would harm Google’s business by deterring some searches by some users. (Opp. at 18.)
Google’s own privacy statement indicates that Google users could not reasonably expect Google to guard the query log from disclosure to the Government. Google’s privacy statement at www.google.com/privacypolicy.html states only that Google will protect “personal information” of users. “Personal information” is expressly defined for users at www.google.com/privacy_faq.html as “information that you provide to us which personally identifies you, such as your name, email address or billing information, or other data which can be reasonably linked to such information by Google.” (Second Decl. of Joel McElvain, Ex. C.) Google’s privacy policy does not represent to users that it keeps confidential any information other than “personal information.” Neither Google’s URLs nor the text of search strings with “personal information” redacted are reasonably “personal information” under Google’s stated privacy policy. Google’s [*24] privacy policy indicates that it has not suggested to its users that non-“personal information” such as that sought by the Government is kept confidential.
However, even if an expectation by Google users that Google would prevent disclosure to the Government of its users’ search queries is not entirely reasonable, the statistic cited by Dr. Stark that over a quarter of all Internet searches are for pornography (Supp. Stark Decl. ¶ 4) indicates that at least some of Google’s users expect some sort of privacy in their searches. n6 The expectation of privacy by some Google users may not be reasonable, but may nonetheless have an appreciable impact on the way in which Google is perceived, and consequently the frequency with which users use Google. Such an expectation does not rise to the level of an absolute privilege, but does indicate that there is a potential burden as to Google’s loss of goodwill if Google is forced to disclose search queries to the Government.

n6 At the hearing, the Government argued that Google should not be concerned about loss of user trust because Google already discloses its users’ search queries on Google Zeitgeist. Had the Government truly believed that substantial amounts of search query information could be obtained from Google Zeitgeist, it is unlikely that the Government would require further search query information from Google. On the Court’s examination of Google Zeitgeist at http://www.google.com/press/zeitgeist.html, the website only provides the top ten search queries by country or the top fifteen gaining search queries in the United States. These queries for the Week of March 13, 2006, include “teri hatcher,” “world baseball classic,” and “sopranos.”

[*25]
3. Trade Secret
Rule 45(c)(3)(B) provides additional protections where a subpoena seeks trade secret or confidential commercial information from a nonparty. Once the nonparty shows that the requested information is a trade secret or confidential commercial information, the burden shifts to the requesting party to show a “substantial need for the testimony or material that cannot be otherwise met without undue hardship and assures that the person to whom the subpoena is addressed will be reasonably compensated.” Rule 45(c)(3)(B). Upon such a showing, “the court may order appearance or production only upon specified conditions.” Id. See also Klay v. Humana, 425 F.3d 977, 983 (11th Cir. 2005); Heat & Control, Inc. v. Hester Industries, Inc., 785 F.2d 1017, 1025 (Fed. Cir. 1986).
a. Search Index and Query Log as Trade Secrets
Trade secret or commercially sensitive information must be “important proprietary information” and the party challenging the subpoena must make “a strong showing that it has historically sought to maintain the confidentiality of this information.” Compaq Computer Corp. v. Packard Bell Elec., Inc., 163 F.R.D. 329, 338 (N.D. Cal. 1995). [*26] A statistically significant sample of Google’s search index and Google’s query log would have independent economic value from not being known generally to the public. The disclosure of a statistically significant sample of Google’s search index or query log may permit competitors to estimate information about Google’s indexing methods or Google’s users. (Decl. of Matt Cutts (“Cutts Decl.”) ¶¶ 26, 27.) By declaration, Google represents that it does not share this information with third parties and it has security procedures to maintain the confidentiality of this information. (Cutts Decl. ¶¶ 29-35; Decl. of Marty Lev.)
At oral argument, counsel for Google acknowledged that samples from its proprietary search index and query log of 50,000 URLs and 5,000 search queries are far less likely to lead to trade secret disclosure than the Government’s original requests. Because Google still continues to claim information about its entire search index and entire query log as confidential, the Court will presume that the requested information, as a small sample of proprietary information, may be somewhat commercially sensitive, albeit not independently commercially sensitive. Successive disclosures, [*27] whether in this lawsuit or pursuant to subsequent civil subpoenas, in the aggregate could yield confidential commercial information about Google’s search index or query log.
b. Entanglement in the Underlying Litigation
Google’s remaining trade secret argument is that despite the narrowness of the sample provided, it would become entangled in the underlying litigation where further discovery would risk trade secret disclosure. Rule 45(c)(3)(B) was intended to provide protection for the intellectual property of nonparties. See Mattel, Inc. v. Walking Mountain Prod., 353 F.3d 792, 814 (9th Cir. 2003) (citing Rule 45 advisory committee’s notes (1991)). On the one hand, a determination of the propriety of further discovery is for another set of motions, and not the one presently before the Court. On the other hand, further discovery in this case that would require disclosure of Google’s trade secrets is not merely a remote possibility. The Government has represented that it has sufficient information from other search engines with which to perform its study, but seeks information from Google because such information would add “substantial luster” to its study — ostensibly [*28] because there is something unique about the world of Google. The nature and extent of that uniqueness, if sufficient to add substantial luster to the Government’s study, is also likely to be a matter of discovery for Plaintiffs in the underlying suit involving more than the Government’s proposed “fifteen-minute deposition” of a Google engineer to confirm that the statistician’s procedure had been followed.
In light of the comments of Plaintiffs’ counsel at the hearing, the Court can foresee further entanglement based on Plaintiffs’ challenge to the Government’s ultimate study. In litigation where the ultimate question is not whether there is adult material on the Internet, but fundamentally about limiting the access by minors to such adult material, it is quite likely that Plaintiffs will challenge the sample produced by Google as not representative of what minors search for or encounter on the Internet. Such an inquiry would require additional discovery, some of which may implicate Google’s confidential commercial information. At the hearing, Plaintiffs’ counsel stated that it had already commenced such discovery with respect to a search engine included in the Government’s study. [*29] In other words, this Court is concerned that a narrow sample of Google’s proprietary index and query log, while in itself not likely to lead to the disclosure of confidential information, may act as the thin blade of the wedge in exposing Google to potential disclosure of its confidential commercial information.
c. Substantial Need
The burden thus shifts to the Government to demonstrate that the requested discovery is relevant and essential to a judicial determination of its case. See Upjohn Co. v. Hygieia Biological Laboratories, 151 F.R.D. 355, 358 (E.D. Cal. 1993). Because “there is no absolute privilege for trade secrets and similar confidential information,” Centurion Indus., Inc. v. Warren Steurer & Assocs., 665 F.2d 323, 325 (10th Cir. 1981) (citing Federal Open Market Committee v. Merrill, 443 U.S. 340, 362 (1979)), the district court’s role in this inquiry is to balance the need for the trade secrets against the claim of injury resulting from disclosure. Heat & Control, 785 F.2d at 1025. The determination of substantial need is particularly important in the context of enforcing a subpoena when discovery [*30] of trade secret or confidential commercial information is sought from non-parties. See Mattel, 353 F.3d at 814.
Google contends that it should not be compelled to produce its search index or query log because the information sought by the Government is readily available from open URL databases such as Alexa and transparent search engines such as Dogpile, or that the Government already has sufficient information from AOL, Yahoo, and Microsoft. As a rule, information need not be dispositive of the entire issue disputed in the litigation in order to be discoverable by subpoena. See Compaq, 163 F.R.D. at 333 n.25. In Compaq, industry practice was a material issue in the lawsuit, and the court refused to quash a subpoena for information from a non-party industry member based on the non-party’s argument that the information could be discovered from other industry members. Id. Similarly, at oral argument, the Government’s counsel likened its discovery goals to a team of researchers studying an elephant by separately viewing the trunk, the ears, the tail, etc., and piecing the research together to get a picture of the elephant as a whole.
In this case, [*31] the Government has demonstrated a substantial need for some information from Google in creating a set of URLs to run through filtering software. It is uncontested that Google is the market leader with over 45% of the search engine market. (Supp. Stark Decl. PP 4-5.) Because Google has the greatest market share, the Government’s study may be significantly hampered if it does not have access to some information from the most often used search engine.
4. Cumulative and Duplicative Discovery
What the Government has not demonstrated, however, is a substantial need for both the information contained in the sample of URLs and the sample of search query text. Furthermore, even if the information requested is not a trade secret, a district court may in its discretion limit discovery on a finding that “the discovery sought is unreasonably cumulative or duplicative, or is obtainable from some other source that is more convenient, less burdensome, or less expensive.” Rule 26(b)(2)(i). See In re Sealed Case (Medical Records), 381 F.3d 1205, 1215 (D.C. Cir. 2004) (citing the advisory committee’s notes to Rule 26 and finding that “the last sentence of Rule 26(b)(1) was added [*32] in 2000 to emphasize the need for active judicial use of subdivision (b)(2) to control excessive discovery”). From this Court’s interpretation of the Government’s general statements of purpose for the information requested, both the sample of URLs and the set of search queries are aimed at providing a list of URLs which will be categorized and run through the filtering software in an effort to determine the effectiveness of filtering software as to certain categories. Both sources of the URL “test set” list seem to be open to the same sorts of criticism by Plaintiffs in the underlying litigation. The content of these objections is not germane to the Court’s determination of whether the information sought is relevant under the broad dictates of Rule 26, but the actual similarity of the two categories of information sought in their presumed utility to the Government’s study indicates that it would be unreasonably cumulative and duplicative to compel Google to hand over both sets of proprietary information.
To borrow the Government’s vivid analogy, in order to aid the Government in its study of the entire elephant, the Court may burden a non-party to require production of a picture [*33] of the elephant’s tail, but it is within this Court’s discretion to not require a non-party to produce another picture of the same tail.
Faced with duplicative discovery, and with the Government not expressing a preference as to which source of the test set of URLs it prefers, this Court exercises its discretion pursuant to Rule 26(b)(2) and determines that the marginal burden of loss of trust by Google’s users based on Google’s disclosure of its users’ search queries to the Government outweighs the duplicative disclosure’s likely benefit to the Government’s study. Accordingly, the Court grants the Government’s motion to compel only as to the sample of 50,000 URLs from Google’s search index.

C. Protective Order
As trade secret or confidential business information, Google’s production of a list of URLs to the Government shall be protected by protective order. Generally, “the selective disclosure of protectable trade secrets is not per se ‘unreasonable and oppressive’ when appropriate protective measures are imposed.” Heat & Control, 785 F.2d at 1025. The Court recognizes that Google was unable to negotiate the particular provisions of the protective order [*34] in the underlying litigation, (Opp. at 12:15-18) but since Google’s filing of its Opposition, the Government has considerably narrowed its request for Google’s information from its proprietary search index such that the risk of trade secret disclosure is substantially mitigated.
The Court grants the motion to compel as to a set of 50,000 URLs from Google’s search index and orders the parties to show cause, if any, on or before April 3, 2006, why a designation of the produced information as “Confidential” under the existing protective order is insufficient protection for Google’s confidential commercial information.

D. Privacy
The Court raises, sua sponte, its concerns about the privacy of Google’s users apart from Google’s business goodwill argument. In Gill v. Gulfstream Park Racing Assoc., the First Circuit held that “considerations of the public interest, the need for confidentiality, and privacy interests are relevant factors to be balanced” in a Rule 26(c) determination regarding the subpoena of documents used to prepare an allegedly defamatory report issued by a non-party trade association. 399 F.3d 391, 402 (1st Cir. 2005) (citing, as also concerned [*35] with the interest of privacy in the context of discovery, Seattle Times Co. v. Rhinehart, 467 U.S. 20, 35 n.21 (1984), In re Sealed Case (Medical Records), 381 F.3d at 1215, and Ellison v. Am. Nat’l Red Cross, 151 F.R.D. 8, 11 (D.N.H. 1993)).
The Government contends that there are no privacy issues raised by its request for the text of search queries because the mere text of the queries would not yield identifiable information. Although the Government has only requested the text strings entered (Subpoena at 4), basic identifiable information may be found in the text strings when users search for personal information such as their social security numbers or credit card numbers through Google in order to determine whether such information is available on the Internet. (Cutts Decl. PP 24-25.) The Court is also aware of so-called “vanity searches,” where a user queries his or her own name perhaps with other information. Google’s capacity to handle long complex search strings may prompt users to engage in such searches on Google. (Cutts Decl. P 25.) Thus, while a user’s search query reading “[user name] Stanford glee club” may not raise [*36] serious privacy concerns, a user’s search for “[user name] third trimester abortion san jose,” may raise certain privacy issues as of yet unaddressed by the parties’ papers. This concern, combined with the prevalence of Internet searches for sexually explicit material (Supp. Stark Decl. P 4) — generally not information that anyone wishes to reveal publicly — gives this Court pause as to whether the search queries themselves may constitute potentially sensitive information.
The Court also recognizes that there may be a difference between a private litigant receiving potentially sensitive information and having this information be produced to the Government pursuant to civil subpoena. The interpretation of the Federal Rules in this Circuit requires that “when the government is named as a party to an action, it is placed in the same position as a private litigant, and the rules of discovery in the Federal Rules of Civil Procedure apply.” Exxon Shipping, 34 F.3d at 776 n.4. However, in Exxon Shipping, the Ninth Circuit was faced with a situation where a litigant sought discovery from the Government; in this case, information is being produced to the Government. [*37] Even though counsel for the Government assured the Court that the information received will only be used for the present litigation, it is conceivable that the Government may have an obligation to pursue information received for unrelated litigation purposes under certain circumstances regardless of the restrictiveness of a protective order. n7 The Court expressed this concern at oral argument as to queries such as “bomb placement white house,” but queries such as “communist berkeley parade route protest war” may also raise similar concerns. In the end, the Court need not express an opinion on this issue because the Government’s motion is granted only as to the sample of URLs and not as to the log of search queries.

n7 Says the DOJ’s [spokesperson Charles] Miller, “I’m assuming that if something raised alarms, we would hand it over to the proper [authorities].” (Decl. of Ashok Ramani, Ex. B, “Technology: Searching for Searches,” Newsweek, Jan. 30, 2006.) (second alteration in original)

E. Electronic [*38] Communications Privacy Act
The Court also refrains from expressing an opinion on the applicability of the Electronic Communications Privacy Act, codified at 18 U.S.C. §§ 2510 to 2712. The ECPA was enacted in 1986 “to update and clarify federal privacy protections and standards in light of dramatic changes in new computer and telecommunication technologies.” Freedman v. America Online, Inc., 303 F. Supp. 2d 121, 124 (D. Conn. 2004) (quoting 132 CONG. REC. S. 14441 (1986)). See also Theofel v. Farey-Jones, 359 F.3d 1066, 1071 (9th Cir. 2004). The Court only notes that the ECPA does not bar the Government’s request for a sample of 50,000 URLs from Google’s index through a civil subpoena.
V. CONCLUSION
As expressed in this Order, the Court’s concerns with certain aspects of the Government’s subpoena have been mitigated by the reduced scope of the Government’s present requests. Nothing in this Order is intended to indicate how the Court would rule on the original broad subpoena or on any follow-up subpoena. The Court’s decision on this Motion to Compel reflects the limited use to which the Government intends to put the information produced [*39] in response to the subpoena. In particular, this Order does not address the Plaintiffs’ concern articulated at the hearing about the appropriateness of the Government’s use of the Court’s subpoena power to gather and collect information about what individuals search for over the Internet.
With these limitations, for the reasons stated in this Order, unless the parties agree otherwise on or before April 3, 2006, Google is ordered to confer with the Government to develop a protocol for the random selection and afterward immediate production of a listing of 50,000 URLs in Google’s database on the following conditions:
1. In the development or implementation of the protocol, Google shall not be required to disclose proprietary information with respect to its database;
2. The Government shall pay the reasonable cost incurred by Google in the formulation and implementation of the extraction protocol;
3. Any information disclosed in response to this Order shall be subject to the protective order in the underlying case;
To the extent the motion seeks an order compelling Google to disclose search queries of its users, the motion is DENIED. The Court retains jurisdiction to enforce this [*40] Order.

Dated: March 17, 2006
/s/ James Ware
United States District Judge