[10:53:48] <_mkrull> hi. i would like to dump a larger mongodb. is the dump about the same size as the database on the filesystem, or do those differ?
[11:01:08] <_mkrull> i have about 33% of the filesystem used.. well i will monitor. another question: i will have to do the dump on the master of a replset as the slaves do not have enough space left. that will impact performance of course.. but does that have other implications as well?
[11:06:19] <_mkrull> and one more thing: does mongodb automatically free space if i delete large parts of the database? if i have a 3TB database and remove 1TB of data will i get that back on the filesystem without running --repair?
[11:56:00] <bartzy> About mongodump and mongoexport, thanks. Can both export index metadata?
[12:44:51] <NodeX> bartzy : read the docs, it's all in there
[13:30:18] <spacepluk> hi, I'm trying to find out if mongodb suits my need for a new application. I need to be able to perform queries in the style of: "documents not in some-list" where some-list can grow significantly. What do you think?
[13:32:35] <MANCHUCK> spacepluk, yes you can check out http://docs.mongodb.org/manual/reference/sql-comparison/
[13:34:33] <spacepluk> I've been reading the documentation for a while
[13:35:00] <MANCHUCK> it also depends on how you design your schema
[13:35:39] <NodeX> spacepluk : how large is the list?
[13:36:51] <spacepluk> I need to track which documents a user has seen and be able to filter them
[13:37:11] <spacepluk> so, it has some physical limits but is kind of indefinite
[13:38:43] <spacepluk> the list will grow over time
[13:39:04] <spacepluk> even if I use references I'd need to put the list in the query, right?
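
A minimal sketch of the "documents not in some-list" filter being discussed, in Python. It only builds the filter document; the helper name `unseen_filter`, the sample ids, and the choice of `_id` as the matched field are assumptions for illustration. The resulting dict would be handed to a driver call such as pymongo's `collection.find`.

```python
def unseen_filter(seen_ids):
    """Build a MongoDB filter matching documents whose _id is NOT in seen_ids."""
    # $nin takes an array, so the whole list travels with every query.
    # That is why a list that keeps growing becomes a concern.
    return {"_id": {"$nin": list(seen_ids)}}


# Hypothetical ids of documents a user has already seen.
query = unseen_filter([101, 102, 103])
print(query)
```

The list is sent in full each time, which answers the question above: yes, even with references the seen list still has to be put in the query.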
[13:53:07] <MANCHUCK> and i have not reached the document limit yet
[13:53:27] <NodeX> 5000 users is nothing, that's probably less than 50kb
[13:54:07] <MANCHUCK> i can find out the largest document
[13:54:23] <MANCHUCK> im sure its not more than 50kb
[13:54:25] <spacepluk> I guess that could work for a while
[13:54:26] <NodeX> it's nowhere near 16mb is my point
[13:54:38] <MANCHUCK> yea thats the point im pointing out
[13:55:05] <spacepluk> that was very helpful, thank you
[13:55:17] <NodeX> spacepluk : at some point the document size won't scale with your app so I would say a graphdb might be more appropriate for the questions
[13:55:37] <spacepluk> any recommendation for graphdbs?
[13:55:53] <NodeX> neo4j is about the most popular I would say
[13:56:03] <ron> though its license may be a bit evil.
[13:56:22] <NodeX> orientdb can do some things like this too
[14:00:32] <bartzy> Thanks, and I just saw that mongoimport/export doesn't reliably determine data types (I guess dates and such), so I get why one should use mongodump/restore for backups.
[14:01:02] <NodeX> yeh, you can convert them but it's a pain
[14:03:08] <NodeX> just write a third-party script to parse the exported json
[14:20:00] <BlakeRG> I am adding a document to my collection and tagging it with an added date. The data shows up like this: "added": { "$date": 1383315474000.000000 }
[14:20:38] <BlakeRG> is this suitable for me to be able to query/group by day if i wanted to pull all documents added for a particular day?
[14:23:05] <NodeX> you can do a $gt / $lt to capture the date
[14:23:43] <BlakeRG> are human readable timestamps query-able as well?
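
A rough sketch of the range query suggested above, in Python. It builds a filter for one calendar day on the `added` field from the example; the helper name `day_filter` and the use of UTC are assumptions, and it uses a half-open `$gte` / `$lt` range rather than a strict `$gt` / `$lt` so that midnight itself is included. The `$date` value from the example is milliseconds since the epoch.

```python
from datetime import datetime, timedelta, timezone


def day_filter(year, month, day, field="added"):
    """Build a filter matching documents whose date field falls on one UTC day."""
    start = datetime(year, month, day, tzinfo=timezone.utc)
    end = start + timedelta(days=1)
    # Half-open range: includes midnight at the start, excludes the next midnight.
    return {field: {"$gte": start, "$lt": end}}


# The "$date" value shown in the chat, converted from milliseconds to a datetime.
stamp = datetime.fromtimestamp(1383315474000 / 1000, tz=timezone.utc)

query = day_filter(stamp.year, stamp.month, stamp.day)
print(stamp.isoformat())
print(query)
```

Because the stored value is a real date type rather than a string, comparisons like this work directly on it.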
[14:50:52] <eldub> kali That's pretty close to what I want. Is there a way I can query the replicaset as a whole? or maybe add --host x.x.x.x --host y.y.y.y etc etc?
[14:52:19] <eldub> kali I can't NOT specify a host because in my mongodb.conf I have it listening on a certain IP. So when I put that command in, it says it can't connect
[14:52:47] <kali> ha. just add the ip to the command line
[14:53:22] <eldub> yea then I'm only returning a single value
[15:55:05] <eldub> kali so the commands you gave me earlier worked great -- 1 more question. Is there a way to have it output the hostnames along with the "true / false" print?
[15:55:27] <eldub> kali I took what you gave me and put --host `hostname` in there so now in my script it comes back saying "true or false" but no hostname.
[16:00:40] <eldub> looking for assistance on configuring a host as a hidden member
[16:01:14] <NodeX> close your eyes - problem solved haha
[17:46:25] <eyda|mon> Derick: I've got many clients and I'd like for each one to have their own database. I'll also have a central database to manage users that are common between them. Is there any worry about having a db per client with the knowledge I may end up with thousands?
[17:46:54] <Derick> eyda|mon: no need to ask me directly :-)
[17:47:03] <Derick> eyda|mon: you can have hundreds of dbs
[17:47:14] <Derick> but, you need to realize what the storage requirements are
[17:50:08] <flyankur_> Derick: Do you know of any good resource to understand/learn the different kinds of schema design I can use for complex use cases?
[17:51:24] <eyda|mon> Derick: i know I don't need to, but you seemed friendly and helpful so I chose to :)
[17:52:49] <eyda|mon> Derick: would thousands cause an issue? The other options I have are making collections with prefixes to keep the data separate, or have everything in one database and use client_ids to get the data out. My concern with the last one is I won't know how much a client consumes as far as space.
[17:53:15] <flyankur_> Derick: sorry for asking directly :P
[17:53:27] <Derick> eyda|mon: files require file pointers, of which you have a limited amount
[17:53:53] <Derick> ulimit -a will tell you how many
[17:54:04] <Derick> on my dev laptop it's only 1024 (although you can easily change it)
[17:54:54] <eyda|mon> oh ok, so each database requires a new filepointer.
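
For reference, the open-file limit that `ulimit -a` reports can also be read from a script. This is a small sketch in Python using the standard `resource` module, which is only available on Unix-like systems; the helper name `open_file_limits` is made up for illustration.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    # RLIMIT_NOFILE is the limit that `ulimit -n` reports in a shell.
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = open_file_limits()
print(f"open files: soft={soft} hard={hard}")
```

The soft limit is the one that applies in practice; it can be raised up to the hard limit without extra privileges.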
[17:55:54] <eyda|mon> Out of the three options, which would you have chosen? 1. database per client. 2. prefix-collection per client. 3. just one database, shared collection with client_id to get the data.
[20:15:50] <spacepluk> for example, i have documents that have the question field. I'd like to get a list of the questions that meet some criteria but I don't want the result to be a list of { question:'value' }
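
One way to get plain values instead of a list of { question:'value' } is to flatten the result on the client side. The sketch below is Python and works on any iterable of documents; the helper name `pluck` and the sample documents are assumptions for illustration. If only unique values are needed, MongoDB's `distinct` may also be worth a look, since it returns bare values rather than documents.

```python
def pluck(docs, field):
    """Collect the value of one field from each document into a flat list."""
    # Documents that lack the field are skipped rather than raising KeyError.
    return [doc[field] for doc in docs if field in doc]


# Hypothetical result of a query that projects only the question field,
# e.g. with a projection like {"question": 1, "_id": 0}.
results = [
    {"question": "How big is a dump?"},
    {"question": "Is space freed on delete?"},
    {"other": "no question here"},
]

print(pluck(results, "question"))
```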