[00:36:13] <mattgordon> I'm on Mongo 1.8 and I'm trying to get all documents for which a field exists and is not null. What is the correct way to do this? I'm having trouble finding old docs and the information for the current version seems to be incorrect for my version
[00:50:22] <Auz> mattgordon, does $exists work for 1.8?
[00:55:32] <mattgordon> yeah that should do it. thank you! i'm seeing very different behavior between $exists: true, $nin: [null], and $not: {$in: [null]} so I was getting pretty frustrated
[00:56:00] <mattgordon> I assume 10gen cleaned it up for the new releases but that makes it difficult to find docs that don't secretly break for edge cases ;)
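A minimal mongo-shell sketch of the distinction being discussed (collection and field names are hypothetical, and 1.8-era behavior may differ from current docs, as noted above):

    // matches documents where the field is present, even if its value is null
    db.docs.find({ myField: { $exists: true } })

    // matches documents where the field is present and not null
    db.docs.find({ myField: { $exists: true, $ne: null } })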
[01:08:39] <jaimef> so trying to split a large RS by adding 4 more systems to the rs, then setting fsyncLock on, shutting down half of the servers, and then changing the replica set name of those left running. shut down and start up the other group. is there an easier way to do this?
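A rough shell outline of the lock-and-split approach described, assuming each step is run on the appropriate member (very much a sketch; rehearse on a copy first):

    // on each member that will keep running: flush writes and block new ones
    db.fsyncLock()
    // cleanly shut down the members destined for the other set, restart them
    // with a different --replSet name, then release the survivors:
    db.fsyncUnlock()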
[02:50:04] <w3pm> if i have a process I want to read/write to a mongodb database, what's the "proper" way of setting this up securely so that no other processes on the network can access the db?
[07:57:52] <strata> using the same version of pymongo (2.4.1) with mongo 2.2.2 on 3 different distros (ubuntu, fedora, arch), i create a record that looks like this: [{stuff: 'hi', somestuff: [{morestuff: 'hi', evenmorestuff: 'hi'}]}]
[07:59:22] <strata> when i pull that record out on arch (stuff = db.whatever.find()), stuff = that record exactly. on all other distros, it gets wrapped like this {0: {the original record}}
[08:01:17] <strata> my only guess is the arch devs enabled some kind of quirk at compile time because it seems the other distros exhibit the latter behavior (which to me is faulty anyway but that's my opinion)
[08:04:00] <strata> yes. actually this is a mongo thing. db.something.save([{some stuff}]) saves as { "_id" : ObjectId("..."), "0" : {somestuff}}
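A quick shell illustration of what strata reports: save() takes a single document, so a one-element array gets stored as a document keyed by array index (exact behavior may vary slightly by driver version):

    db.whatever.save([{ stuff: 'hi' }])
    // stored as: { "_id": ObjectId("..."), "0": { "stuff": "hi" } }

    db.whatever.save({ stuff: 'hi' })
    // stored as: { "_id": ObjectId("..."), "stuff": "hi" }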
[08:14:42] <oskie_> on the primary node (the oldest node) in the replica set the disk usage is 350G. On another it is 280G. What's all that wasted space? Is there a way to reclaim it?
[08:41:39] <tomlikestorock> is there a way to elemmatch on items in a document and items in a list in that document at the same time?
[08:45:18] <tomlikestorock> that is, elemMatch on a field in a document, and one of the sub attributes of that field is a list, and specify conditions to elemmatch on items in that list as well
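A hedged sketch of what tomlikestorock describes: combining a plain condition on a document field with $elemMatch on an embedded array, in one query (all names hypothetical):

    db.things.find({
        status: "active",                // condition on a top-level field
        items: {                         // conditions on a single element of an embedded array
            $elemMatch: { qty: { $gt: 5 }, color: "red" }
        }
    })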
[09:21:57] <kali> solars: 2.3.2 is not for production use
[09:21:59] <chovy> i didn't create a user/pass or anything I figured it would only accept local connections. But the past 2 months my db has been accessible to anyone.
[09:22:11] <solars> kali, yes I know, I just want to know if I can replace it for testing or not
[09:23:01] <kali> chovy: well, it's good you tried mongohub :)
[09:23:22] <solars> by not ready for production use you mean it's not guaranteed to be stable, right - or are there any performance issues because of debug outputs etc?
[09:24:01] <chovy> so how do i lock this thing down so the world can't read my db?
[09:24:40] <kali> solars: i think it's the same compilation options as the stable builds, so there should not be huge performance issues
[09:37:53] <NodeX> the reason it's great and fast (not the main reason but partly) is because it takes all these things that a DB shouldn't really handle out of the equation
[09:38:14] <kali> chovy: i don't know... i have this bad habit of spending hours reading up on a technology before deciding to download it, so sorry, but i can't relate there
[09:38:44] <NodeX> personally I read the config file and locked it down to localhost before ever starting it
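For reference, a minimal 2.x-era mongodb.conf sketch that only accepts local connections (an assumption: no remote clients need direct access; admin tools can still reach it over an SSH tunnel):

    # /etc/mongodb.conf
    bind_ip = 127.0.0.1
    port = 27017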
[09:38:54] <kali> chovy: it is also the first bullet point on the production notes page
[09:41:59] <kali> chovy: i like to be able to run a mongod --dbpath . & in a shell and just use it as a sandbox for demos, trials and tests... that is not possible to do when configuring gets in the way
[09:42:01] <NodeX> https://gist.github.com/0699bbecc2ee864ce439 <--- easy firewall script ..... put it in a .sh, execute it, once happy use iptables-save
[09:42:30] <chovy> ok, at least i can't connect remotely anymore
[09:42:51] <Gargoyle> chovy: From a sys admin point of view, you should assume "This shit is open to the world". If you give a server an internet facing IP address without a firewall, it's like leaving your front door wide open while you pop to the shop for milk.
[09:46:53] <NodeX> SQLi = whole server compromised
[09:47:06] <chovy> Gwayne: yeah, that's what i've always done, but mysql by default doesn't allow remote connections. I assumed this worked the same way.
[09:47:30] <chovy> oh, what's the tunnel trick to connect with mongohub?
[09:47:44] <oskie> is there a way to speed up oplog recovery? It seems the initial data replication to a new node is quite fast, but when it comes to oplog replay it takes 30s to apply 1min worth of changes
[09:48:56] <NodeX> Gargoyle : the root sql account (unless you disable it) has exec privs by default on mysql which means a simple SQL injection would've rooted your entire server
[09:49:09] <oskie> (i guess it is all the indexes slowing it down)
[09:49:21] <Gargoyle> NodeX: So where's the problem?
[10:01:50] <Gargoyle> In the main Host/port field, I would put localhost:27017, then Bind Address = 127.0.0.1 27017 (Assuming you don't have a local mongo instance)
[10:02:02] <Gargoyle> And SSH Host = your servers real IP
[10:25:33] <Gargoyle> Oh… NodeX - You can't just go giving away answers! ;)
[10:25:37] <NodeX> quick question which is confusing me... say I have an images collection and normally I look up on gid (gallery_id) to get a list of all images in a gallery ... now I want to adapt my query to sort by _id and put a fed _id at the top but still return a list of all images in the gallery but with my "fed" id being the first in the list .. can this be done (not something I have ever tried but one
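One hedged way to do this with two queries, since no single sort in 2.2 will pin an arbitrary document to the top (gid comes from the question; fedId and galleryId are placeholders):

    var fedId = ObjectId("...");   // the _id to pin first
    var first = db.images.findOne({ _id: fedId, gid: galleryId });
    var rest  = db.images.find({ gid: galleryId, _id: { $ne: fedId } })
                         .sort({ _id: 1 }).toArray();
    var results = [first].concat(rest);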
[10:29:36] <chovy> i have not found anything related to migrating mongo data.
[10:29:53] <NodeX> this is where the line gets blurred with relational mappers - they don't really have a place in Mongo imho
[10:29:54] <Gargoyle> chovy: Mongo is that way by design, so yes - in code you need logic to check for the existence of a specific sub doc before you use it.
[10:29:56] <chovy> which means, if it doesn't have a comments subdoc, then nobody can ever leave a comment.
[10:39:26] <Derick> and you actually get to learn the product you're using :-)
[10:44:37] <solars> kali, any idea if something regarding auth has changed? I've replaced the binary with 2.3.2 and get: Tue Jan 15 11:42:27.503 [conn24] assertion 16550 not authorized for query on history_production.a
[10:44:43] <solars> for an auth that has worked before?
[10:51:08] <Gargoyle> kali: Some people see "apt-get install" or "brew install" and use the commands but never bother reading any further, and wonder why their systems end up in such a mess.
[10:52:26] <Gargoyle> It took me 5 mins to install mongo on OSX - It took another 2 hours reading some basic info about Launch Control.
[10:52:37] <solars> but this discussion is endless, you could also argue that because of this, everyone would have to write assembler :)
[10:52:42] <NodeX> On the flip side of that one can also get caught up in not understanding things and get too weighed down in "other people's" versions of "the done thing"
[10:52:46] <solars> abstraction isn't only a bad thing
[10:53:33] <kali> solars: it's good to have a good understanding of at least one level behind the one you're using daily
[10:53:43] <NodeX> One doesn't have to be a genius to write fast / efficient web apps - one just has to understand signal flow and have a basic knowledge of syntax
[10:53:49] <oskie> you don't need db.fsyncLock if you are backing up using LVM snapshots and the journal is on the same volume as the data, right?
[10:54:44] <kali> solars: and it's vital to have somebody understanding DBs in a team of application developers, as it is important to have somebody around who can do a bit of assembly, or understand a tcp dump
[10:57:07] <kali> NodeX: some are really good... salat in the scala world is small, efficient, focused... and because scala is so statically typed, using mongo directly is really a pain
[10:57:11] <Gargoyle> I see no positive outcome from using an ORM/ODM that requires either code annotations or public accessors/methods for everything.
[10:57:20] <kali> NodeX: but for dynamic languages, i tend to agree
[10:57:28] <Gargoyle> Most languages have some notion of serialization.
[10:57:46] <NodeX> I don't use (currently) a typed language, I'm teaching myself C so perhaps one day!
[10:58:09] <NodeX> I hear the argument for teams of people when working on large projects
[10:58:17] <NodeX> and the need for a common way to do things
[10:59:04] <kali> NodeX: well, as all you get from mongodb is basically Map[String,Object] (in scala notation) every time you read a field you need to cast it to the right type
[10:59:11] <NodeX> Anyone using ZeroMQ behind their mongo in here?
[10:59:22] <NodeX> Ah, kali - I do that anyway LOL
[10:59:34] <NodeX> just habit to cast everything when working in PHP !!
[10:59:54] <kali> yeah, i'm still trying to forget about that period of my life :)
[12:07:57] <Derick> it might just not give as much as you think it would....
[12:08:01] <noobie25> i'm having trouble testing out geospatial queries. when i runCommand: geoNear ... i get results, but clearly there are better results available in proximity. I was storing geo x, y values as floats/doubles ... is that the underlying problem?
[12:08:18] <kali> limpc: this thing is... most of us (in the community, and i guess in the dev team too) think it is useless
[12:08:23] <Derick> with the lock yielding, collection level locking is likely not going to give a lot of extra performance
[12:08:53] <NodeX> noobie : can you pastebin your query and your indexes and a sample document?
[12:09:16] <Derick> noobie25: make sure you store things in lon, lat, and not lat, lon
[12:09:17] <limpc> well it should help reduce the number of replica sets needed for the same volume of queries, yes?
[12:09:57] <Derick> limpc: uh? what makes you jump to that conclusion?
[12:10:35] <vikash_> I am trying to insert data in mongo but after the first insertion I get "TypeError: object is not a function" [Source code -> https://gist.github.com/4538178 ] Please help
[12:10:45] <limpc> the main reason for replica sets is to mitigate backlogs from large numbers of inserts (e.g. high user concurrency), correct?
[12:11:54] <kali> limpc: the main driver of replica sets is replication and failover
[12:12:03] <Derick> kali: well, and read performance
[12:12:42] <limpc> sorry, had them flipped. its 6am and I havent slept yet
[12:13:33] <limpc> so i would think that by adding collection level locking, you'd be able to get more concurrency per shard, and reduce the number of shards necessary to support certain volumes
[12:14:07] <Derick> limpc: "think" is not a good benchmarking tool :-)
[12:14:11] <kali> limpc: it's not that clear, because you'll saturate the hardware at some point
[12:15:42] <limpc> hm, isn't that like saying you're going to saturate a computer's northbridge before you max out all 6 cores of a 6-core cpu?
[12:17:01] <webber_> [13:07] <kali> limpc: this thing is... most of us (in the community, and i guess in the dev team too) think it is useless + [13:13] <@Derick> limpc: "think" is not a good benchmarking tool :-)
[12:17:18] <kali> webber_: this is helpful, thanks :)
[12:20:43] <noobie25> Derick: hi derick. i verified that values are being stored in the following order : lng, lat
[12:20:59] <limpc> ok well i worked at zynga. one of the reasons we decided to forgo mongo last year was because of the locking. it caused concern where high loads were involved, and we weren't convinced its speed justified the cost of the # of servers we'd probably need to replace mysql/memcached. Now I'm at another very high volume business and there's some similar concern.
[12:22:20] <limpc> if there were collection level locks, it'd get a lot more attention
[12:22:29] <NodeX> running mongo is certainly less hardware needy than running mysql
[12:23:01] <noobie25> Derick: however, when querying for San Francisco region: (122.41, 37.77) i get results from korea (127.23, 38.31) so i think it has something to do with these lng lat pairs.
[12:24:36] <vikash__> I am trying to insert data in mongo but after the first insertion I get "TypeError: object is not a function" [Source code -> https://gist.github.com/4538178 ] Please help
[12:25:01] <NodeX> vikash__ : it's in your driver, please see if you can paste the raw query sent to mongo
[12:25:10] <noobie25> Derick: thank you! life saver
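For the record, the symptom above (querying 122.41 for San Francisco and getting Korea back) is consistent with a dropped minus sign: western longitudes are negative. A minimal sketch with lon, lat ordering (collection name hypothetical):

    db.places.ensureIndex({ loc: "2d" })
    // San Francisco is at longitude -122.41, latitude 37.77
    db.runCommand({ geoNear: "places", near: [-122.41, 37.77], num: 10 })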
[12:25:42] <noobie25> i'd rather store my lng, lat as strings ... do you know if mongo supports that datatype as well?
[12:27:08] <NodeX> vikash__ : please make the output json friendly so we can see what's going on
[12:28:46] <vikash__> ok, but the first time dataToPush does get inserted into the collection. The problem is, the second time it gives me the error
[12:29:01] <vikash__> And I am very new to Node and mongo :)
[12:30:29] <NodeX> perhaps it's a unique key it's trying to overwrite?
[12:31:27] <vikash__> hmm,, how can I get rid of it then?
[12:39:42] <NodeX> all I can think of is that it's trying for ObjectId and using Object instead
[12:39:53] <NodeX> which is why I need the raw query
[12:41:24] <vikash__> what do you mean by raw query? I have an object dataToPush, and when a user sends a message, it updates the fields in dataToPush and I am trying to insert this into mongodb
[12:42:01] <NodeX> the raw query sent to mongo .... i.e. what the string looks like when it leaves the driver
[12:43:06] <vikash__> By any chance did you mean this -> result ->[{"name":"Vikash","msg":"Hi","channel":"null","is_spam":false,"is_delete":false,"timestamp":"2013-01-15T12:31:38.054Z","_id":"50f54c44206f74e145000001"}]
[12:43:15] <vikash__> Data -> {"name":"Vikash","msg":"Hi","channel":"null","is_spam":false,"is_delete":false,"timestamp":"2013-01-15T12:31:38.054Z","_id":"50f54c44206f74e145000001"}
[12:43:16] <vikash__> debug - websocket writing 5:::{"name":"add_message","args":["Vikash","I am Vikash"]}
[12:43:26] <vikash__> Data -> {"name":"Vikash","msg":"Hi","channel":"null","is_spam":false,"is_delete":false,"timestamp":"2013-01-15T12:31:38.054Z","_id":"50f54c44206f74e145000001"}
[13:19:13] <Aktau> I'm trying to add journalling but it doesn't recognize the option
[13:19:24] <Aktau> Is it even possible for that version, if somebody knows?
[13:23:18] <vikash__> I edited my JSON instead of dataToPush and made it individual in db.createCollection() and it works now
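A hedged sketch of the fix vikash__ describes: build a fresh object per insert rather than reusing one that already carries the _id from the previous insert (names loosely follow the pasted output and are assumptions):

    function makeMessage(name, msg) {
        // a new object each time, so the driver assigns a fresh _id
        return { name: name, msg: msg, channel: null,
                 is_spam: false, is_delete: false, timestamp: new Date() };
    }

    // `collection` assumed to come from db.collection('messages')
    collection.insert(makeMessage("Vikash", "Hi"), function (err, docs) {
        if (err) console.error(err);
    });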
[13:32:51] <woozly> guys, how do I join queries? I have two collections: db.queue( { "myid": 12345, "parent_id": 9126} ); and db.parents( { "myid": 9126, "count": 0 } ). How can I update db.parents by finding db.queue by id? I mean: queue.find({ "myid": 12345}) <--- get 'parent_id' from this result and use it to increase the parents() record :/
[13:34:31] <woozly> In SQL: UPDATE parents SET parents.count = 999 LEFT JOIN queue q ON q.myid = 12345 WHERE parents.myid = q.parent_id; (something like this, if I don't do it wrong.. :)
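MongoDB has no joins, so the usual pattern is two round trips, sketched here in the shell with the collections from the question:

    var q = db.queue.findOne({ myid: 12345 });       // step 1: fetch the parent_id
    db.parents.update(
        { myid: q.parent_id },                       // step 2: update the matching parent
        { $inc: { count: 1 } }                       // or { $set: { count: 999 } }
    );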
[15:10:43] <amimknabben> denormalize models and run threads, really?
[15:20:38] <doxavore> What kind of connection pool sizes are people using in multithreaded servers? I've tried everything from 10 to 512 in JRuby and I still run out.
[15:22:24] <mansoor-s> How do I go about searching with values
[15:22:40] <mansoor-s> say I have a value: "MyAwesomeValue" I want all documents that have the term Awesome in it
[15:24:15] <Derick> mansoor-s: you will need a regexp query, but that is not going to use an index
[15:35:04] <MatheusOl> Derick: Not with b-tree, of course
[15:35:18] <MatheusOl> Derick: But on PostgreSQL you can use GIN or GiST indexes for that
[15:40:34] <Derick> gist is not a type of index, but just a data structure (says wikipedia: http://en.wikipedia.org/wiki/GiST) - if you mean their tsearch2 index, then that is just FTS - which needs split up words (http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/tsearch-V2-intro.html) http://www.postgresql.org/docs/8.3/static/textsearch-indexes.html also talks for gist about "each word" and for gin about "GIN indexes store only the words"
[15:41:40] <mansoor-s> is it acceptable to run map/reduce jobs as a normal function of the application?
[15:41:44] <webber_> LIKE can be answered from a plain RDBMS index if LIKE uses an index prefix, e.g. LIKE "A%".
[15:42:30] <Derick> mansoor-s: M/R won't help you here
[15:42:39] <Derick> webber_: same in Mongo, with /^A/
[15:44:55] <mansoor-s> Derick, are you saying I can do something like what I want in Mongo?
[15:45:05] <mansoor-s> or I have to go with something lime MySQL or Postgres
[15:45:10] <Derick> mansoor-s: you are being really vague about what you want
[15:45:29] <Derick> you can query with a regular expression, it's just not going to use an index
[15:45:31] <MatheusOl> webber_, Derick: What I'm saying is that PostgreSQL can, in some cases, use an index with LIKE, even with a leading wildcard: LIKE '%bla%'
[15:45:39] <MatheusOl> Using one of the extensions above
[15:45:57] <MatheusOl> And GIN or GiST indexes (yes they are data structure, and also can be used as indexes)
[15:47:36] <MatheusOl> Derick: I agree, I just mentioned it wouldn't use an index, but that's not always a bad thing. Also, generally, we expect some slowness searching with regular expressions (compared to equality searches)
[15:47:50] <mansoor-s> Derick, We are creating a search functionality for our application. We want to be able to search by value contents.
[15:49:31] <mansoor-s> "this is my string" i want to search the string
[15:49:42] <Derick> mansoor-s: can you provide a proper (slightly) complex document as well as what you're trying to look for?
[15:51:26] <mansoor-s> Book titles. I have a book (document) with Title(key) with "Harry Potter" (Value). When I search Harry, i want this document to be returned
[15:52:08] <mansoor-s> and it's not necessarily by individual words either, so it could be Harrypotter and I still want it returned
[16:07:01] <NodeX> and "Harrypotter" cannot be split on words as it's one word!
[16:07:43] <Gargoyle> NodeX: You could still use regex - /^harry/ will match and use an index, but the app will have to convert all keywords to lowercase
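A minimal sketch of that anchored-regex approach, keeping a lowercased copy of the title alongside the original (field names hypothetical):

    db.books.ensureIndex({ title_lower: 1 })
    db.books.insert({ title: "Harry Potter", title_lower: "harry potter" })

    db.books.find({ title_lower: /^harry/ })  // anchored prefix: can use the index
    db.books.find({ title_lower: /harry/ })   // unanchored substring: full scan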
[16:13:54] <NodeX> unfortunately (at present) mongodb is not suited for a performant text search
[16:14:13] <NodeX> 2.4 will address some issues but it was never meant to be a replacement for an RDBMS with FTS
[16:14:29] <NodeX> or a secondary search index (Lucene, ES etc)
[16:14:33] <MatheusOl> Of course, not as fast as prefixed searches
[16:17:09] <MatheusOl> But I think it is a good idea to use a secondary search index, as NodeX said
[16:53:49] <coredump> So, I need to remove a replica member but it is the primary atm. Do I use that procedure to force the primary to move to the other server and then stop/remove the old primary?
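The usual shell sequence for that, hedged since the details depend on the set's configuration:

    // on the current primary: step down and refuse re-election for 120 seconds
    rs.stepDown(120)
    // once another member has become primary, remove the old one
    rs.remove("oldhost:27017")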
[17:55:03] <UForgotten> Anyone seen this before? https://jira.mongodb.org/browse/SERVER-8178
[17:59:25] <konr_tra`> Is there a tool to effortlessly import gigabytes of kinky csv files (valid, but with entries containing newlines, escaped quotes and all sorts of nastiness) into mongo?
[18:04:16] <konr_tra`> oh wait, mongoimport does work with csv :D
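A minimal mongoimport invocation sketch for a CSV with a header row (database, collection, and file names are placeholders):

    mongoimport --db mydb --collection stuff --type csv --headerline --file data.csv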
[18:34:55] <limpc> hmm how did foursquare get over the 3.2 million collection limitation in mongo?
[18:36:44] <limpc> Our mongo struct is fairly complicated, we were wanting to make each user their own collection for speed. but --nssize is limited to 2047, which works out to just under 3.26 million collections
[18:37:42] <iggy__> hi, can someone help with an update query?
[19:49:35] <themoebius_> is there a way to reclaim disk space in place yet? i.e. without exporting the whole DB and reinitializing like in a repair?
[19:57:34] <UForgotten> themoebius_: there is a compact function that I read about, but use it at your own risk
[19:57:54] <UForgotten> and it requires a lot of headroom to defrag
[19:58:14] <themoebius_> UForgotten: yeah but that doesn't reclaim disk space, it just allows mongo to reuse space it's already claimed
[19:58:42] <themoebius_> I mean my /db partition is like 95% full but the actual data size is far less
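For context, the options being weighed, sketched in the shell (each has caveats on 2.2; check the docs before running either on a live system):

    // compact: defragments a collection in place so mongo can reuse the space,
    // but does not return it to the filesystem
    db.runCommand({ compact: "mycollection" })

    // repairDatabase: rewrites the data files and does shrink them, but needs
    // roughly as much free disk as the current data size and blocks the db
    db.repairDatabase()

    // on a replica set, wiping a member and letting it resync also reclaims space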
[20:16:50] <akeem> hi, I am trying to track down a slow process. it's ok when I do an explain on the query but when I profile it, it takes a good amount of time
[20:17:36] <akeem> if it helps, it's an $or query with both sides of the or indexed
[20:35:04] <owen1> i have replica set of 3. i killed the primary and new primary was elected. i killed the new one but the last survivor, the 3rd, is still secondary. is it normal?
[20:48:25] <Gargoyle> owen1: The final server will not self-promote as it does not know if the other servers are dead, or if it itself has been segmented on the network.
[20:48:38] <Gargoyle> But you can manually promote it to a primary
[20:51:13] <Gargoyle> owen1: I can't remember exactly, but this is the general area of the docs. http://docs.mongodb.org/manual/tutorial/reconfigure-replica-set-with-unavailable-members/
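Roughly the relevant step from that page, for when only a minority of members survives (hedged; verify against the linked docs before forcing a reconfig):

    // connect to the surviving member
    cfg = rs.conf()
    cfg.members = [cfg.members[2]]       // keep only the survivor (index is hypothetical)
    rs.reconfig(cfg, { force: true })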
[20:54:05] <owen1> Gargoyle: interesting. i wonder what would happen if i had an rs of 4.
[20:55:16] <Gargoyle> owen1: You shouldn't (can't)
[20:56:04] <owen1> Gargoyle: so stick to 3 or 5. i want to configure my hosts with a bash script. i don't want to run mongo console and type manual commands like rs.initiate, rs.add etc. is it possible and is there an example for doing that?
[20:56:11] <Gargoyle> An rs should have an odd number of members.
[20:56:34] <Gargoyle> owen1: It's all just javascript! :)
[20:57:00] <Gargoyle> make config.js, and then just do "mongo < myconfig.js" type thing
[21:05:57] <owen1> Gargoyle: oh. so u can dump data into the mongodb collections that will setup the replica set etc?
[21:06:22] <owen1> Gargoyle: can u send me links to examples? that's perfect
[21:07:14] <Gargoyle> if you redirect (or pipe) to mongo shell, its the same as if you were running those commands by typing them.
[21:08:32] <Gargoyle> owen1: Eg. create a test.js with the following:-
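A minimal sketch of what such a test.js might contain (set name and hostnames are hypothetical), then run with mongo < test.js:

    rs.initiate({
        _id: "abc",
        members: [
            { _id: 0, host: "host1:27017" },
            { _id: 1, host: "host2:27017" },
            { _id: 2, host: "host3:27017" }
        ]
    });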
[21:14:09] <owen1> btw, i noticed that even though i have replSet = abc in /etc/mongodb.conf it's being ignored, so i had to add it to the mongod command.
[21:58:26] <Hoverbear> Hi all, I'm working with Mongoose and have a query result (from like Foo.findById(myId)) and have assigned it to something… But now when I try to use the .id function on a sub doc I'm getting errors that it does not have such a method… Any ideas?
[22:58:57] <jaimef> does restarting a mongod help in any way to catch up on replication delay?
[23:15:34] <Derick> What's with all the new extra whitespace on the osm.org pages?! Lots of scrolling needed now :-/
[23:16:54] <Zelest> where does osm get all the data from?
[23:48:57] <owen1> what triggers an election for a new primary? only loss of connection to a member or are there more conditions like high cpu/memory/diskspace?