#mongodb logs for Friday the 15th of May, 2015

[00:04:28] <sabrehagen> GothAlice: so as per your suggestion i've applied 0.0.0.0 in my config file, but when trying to restart the mongodb service i encounter this:
[00:04:29] <sabrehagen> service mongodb restart
[00:04:29] <sabrehagen> stop: Unknown instance:
[00:04:29] <sabrehagen> mongodb start/running, process 66359
[00:04:50] <sabrehagen> how can the service be identified for starting, but not for stopping?
[00:05:56] <StephenLynx> it wasnt running before
[00:05:59] <StephenLynx> imo
[00:06:11] <sabrehagen> it was running
[00:06:21] <sabrehagen> running service mongodb restart always produces this result
[00:06:23] <sabrehagen> every time
[00:06:27] <StephenLynx> hm
[00:06:33] <StephenLynx> let me see what I get
[00:07:02] <StephenLynx> ah
[00:07:06] <StephenLynx> try using mongod
[00:07:10] <StephenLynx> instead of mongodb
[00:08:51] <sabrehagen> okay that works, thanks. now i need to check why it's not listening on 0.0.0.0
[00:09:33] <StephenLynx> where are you setting that?
[00:09:41] <sabrehagen> /etc/mongodb.conf
[00:10:00] <StephenLynx> by binding it to 0.0.0.0?
[00:10:55] <sabrehagen> i see there is a /etc/mongod.conf
[00:10:59] <sabrehagen> i'll try it there
[00:12:05] <StephenLynx> yeah, on my install mongodb.conf doesnt exist, only mongod.conf
[00:13:09] <sabrehagen> this was a clean install - i wonder how it got there
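
For reference, the bind setting being discussed looks like the sketch below; which file and which syntax apply depends on whether the distro package (mongodb, /etc/mongodb.conf) or the upstream package (mongodb-org, /etc/mongod.conf) is installed, so treat the paths and layout as assumptions:

    # /etc/mongod.conf (YAML style, used by the mongodb-org packages)
    net:
      port: 27017
      bindIp: 0.0.0.0    # listen on all interfaces; restrict access with a firewall

    # /etc/mongodb.conf (older ini style, used by many distro packages)
    bind_ip = 0.0.0.0
    port = 27017

    # then restart the upstream service
    sudo service mongod restart
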
[00:45:15] <sabrehagen> so, completely independent of my previous issues, our production server was hard rebooted by scheduled maintenance by our cloud provider and after it restarted, our mongodb data directory is empty. has anybody ever heard of this before? we have backups, but this is seriously bad
[00:46:57] <StephenLynx> daium
[00:47:12] <StephenLynx> no, never heard about that. I think your hosting dun goofed.
[00:47:31] <StephenLynx> what service are you using?
[00:53:58] <sabrehagen> i wouldn't blame the hosting
[00:55:17] <StephenLynx> I don't really think a hard reboot would cause files to just vanish into the void.
[00:55:29] <Spec> depends on the cloud instance and how it's set up
[00:55:33] <sabrehagen> neither would i!
[00:55:36] <StephenLynx> or, the host.
[00:55:46] <Spec> I would expect it for ephemeral instances!
[00:55:51] <sabrehagen> Spec: it's just a vps
[00:56:06] <StephenLynx> which hosting?
[00:56:14] <sabrehagen> rackspace
[05:03:12] <mitereiter> i got a little problem with Tag-Aware Sharding
[05:03:41] <mitereiter> i have a sharded collection test.foo
[05:04:04] <mitereiter> the shard key is { "_id.countryCode" : 1, "_id.recordId" : 1 }
[05:04:55] <mitereiter> i would like to add some ranges like this
[05:05:28] <mitereiter> sh.addTagRange( "test.foo",{ "_id.countryCode": "I", "_id.recordId": MinKey },{ "_id.countryCode": "I", "_id.recordId": MaxKey },"shard0000")
[05:06:31] <mitereiter> i get the following error
[05:06:43] <mitereiter> Error: can't have . in field names [_id.recordId]
[05:07:40] <mitereiter> I've tried the following solutions but none of them worked
[05:07:55] <mitereiter> https://jira.mongodb.org/browse/SERVER-6999
[05:08:06] <mitereiter> after this i got the
[05:08:29] <mitereiter> Error: error: _id.countryCode is not valid for storage.
[05:13:20] <mitereiter> then i tried to encode the dot as \uff0e
[05:14:02] <mitereiter> i could add the ranges this way but the balancer threw this error
[05:14:18] <mitereiter> Last reported error: field names of bound { _id.countryCode: "I", _id.recordId: MinKey } do not match those of keyPattern { _id.countryCode: 1.0, _id.recordId: 1.0 }
[05:14:41] <mitereiter> so thats pretty much where I am now
[05:15:12] <mitereiter> and im using version 3.0.2 on a centos 7 server
[05:17:30] <joannac> mitereiter: last i checked the redefinition of validateForStorage works
[05:18:44] <mitereiter> well I did get a different error that way, but I've tried it twice
[05:21:14] <mitereiter> hm, if I only overwrite the validateForStorage it seems to work
[05:22:04] <mitereiter> before that, I had overwritten the addTagRange function too
[05:22:32] <mitereiter> tyvm, joannac
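
For anyone hitting the same wall: the fix discussed above (per SERVER-6999) is to neutralize the mongo shell's client-side field-name check before adding the range, without also replacing sh.addTagRange itself. A rough sketch, assuming the shell of that era exposes the check as DBCollection.prototype._validateForStorage (the exact symbol can vary between shell versions, so verify against your shell):

    // no-op the shell-side "can't have . in field names" check
    DBCollection.prototype._validateForStorage = function () {};

    // then add the tag range as usual
    sh.addTagRange(
        "test.foo",
        { "_id.countryCode": "I", "_id.recordId": MinKey },
        { "_id.countryCode": "I", "_id.recordId": MaxKey },
        "shard0000"
    );
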
[07:58:51] <okanck> joannac: compound index didn't help
[09:34:03] <watmm> I'm seeing tutorials online with a comma separated list of IPs for bind_ip, but my mongo (2.6.6) doesn't seem to like it. Is this even possible or do i have to make it 0.0.0.0 and then restrict with iptables?
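
Comma-separated bind_ip values are documented as supported; one possible culprit (an assumption about what is going wrong here, not something confirmed in the log) is whitespace after the commas, which older config parsers reject:

    # works
    bind_ip = 127.0.0.1,10.0.0.5
    # some older versions choke on the space after the comma
    bind_ip = 127.0.0.1, 10.0.0.5
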
[09:57:15] <vagelis> Hello
[09:59:28] <vagelis> I am trying to understand cursor.hasNext(). What does it mean when the docs say it returns true if the .find() query can iterate further to return more documents? I used .limit(1) in my .find(), so it returns only 1, so it can't iterate for more, right?
[10:00:40] <vagelis> I am new to Mongo btw so I hope I made a clear explanation.
[10:05:06] <vagelis> I guess the .limit() doesnt count.
[10:05:58] <Gevox_> vagelis: hasNext() basically takes a sneak peek one step further from the current location to see if there is any record coming after or not. It returns true when there is actually something next, otherwise false.
[10:07:09] <Gevox_> To iterate over a list you need to call the .find() method and store the "Cursor" it returns into a Cursor object. Then use this cursor object to iterate over the elements
[10:07:45] <Gevox_> Sample code: http://pastebin.com/SuCBfrXA - more about cursors http://api.mongodb.org/java/2.6/com/mongodb/DBCursor.html
[10:07:55] <Gevox_> and now i will brb :)
[10:08:13] <vagelis> Yes but I thought that if I pass the .limit(1) then it cant iterate cause theres only one but I guess it doesnt care about limit
[10:08:23] <vagelis> its what find() returns
[10:08:46] <vagelis> thanks Gevox_
[10:13:26] <vagelis> wait, so Cursor does not start "counting" right? If i find for example 1 result from my query Cursor lets say it waits in -1 position and when i call hasNext(), Cursor goes to 0 position (First element)...right?
[10:34:31] <Gevox_> vagelis: there are various methods on a cursor. If you call hasNext(), yes, it walks a step further from your current position (it might be looking at a null value), and any exception there is already handled by Mongo for you.
[10:34:45] <Gevox_> hasNext -> Checks if there is another object available
[10:34:53] <Gevox_> next -> Returns the object the cursor is at and moves the cursor ahead by one.
[10:35:33] <Gevox_> The rest of cursor docs here -> http://api.mongodb.org/java/2.6/com/mongodb/DBCursor.html#hasNext()
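
A minimal shell sketch of the hasNext()/next() behaviour being described, using a hypothetical "people" collection:

    var cursor = db.people.find().limit(1);
    while (cursor.hasNext()) {      // true while another document is still available
        printjson(cursor.next());   // returns the current document and advances the cursor
    }
    // with limit(1), hasNext() is true once (before the single document is consumed)
    // and false afterwards, so the loop body runs exactly one time.
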
[11:21:08] <okanck> Why is the query so slow? Here is explain(): http://pastebin.com/UYP053rm
[11:22:56] <okanck> I created bounded index: {PlayerID:1, Time:-1}, but it didn't help
[11:23:27] <okanck> *compound
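
Without the pastebin it's hard to say why the query stays slow, but the index only helps if the query shape matches it. A sketch with assumed collection and field names:

    // compound index: equality on the prefix field, range/sort on the second
    db.scores.createIndex({ PlayerID: 1, Time: -1 });   // ensureIndex() on older shells

    // this shape can use the index for both the filter and the sort
    db.scores.find({ PlayerID: 12345 }).sort({ Time: -1 }).explain();
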
[14:06:33] <Haris> hello all
[14:07:56] <Haris> I need the 3.x version for production. is it possible to get it via a yum repo ? I was looking at the url ( http://docs.mongodb.org/v2.4/tutorial/install-mongodb-on-red-hat-centos-or-fedora-linux/ ). I used the yum repo info for x64. but I'm not getting any versions in yum search that are for 3.x. Everything revolves around 2.4.x or 2.6.x
[14:08:26] <cqdev> Any idea why sorting isn't working in this aggregation? http://pastebin.com/ekU1czun
[14:16:08] <Haris> what's the difference between mongodb and mongodb-org ?
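
The short answer to that last question: mongodb is the older distro-built package, while mongodb-org is the upstream package published by MongoDB, Inc., and 3.0 only ships in the upstream repo. The repo file from the 3.0 install docs of the day looked roughly like this (verify the baseurl against the current documentation):

    # /etc/yum.repos.d/mongodb-org-3.0.repo
    [mongodb-org-3.0]
    name=MongoDB Repository
    baseurl=https://repo.mongodb.org/yum/redhat/$releasever/mongodb-org/3.0/x86_64/
    gpgcheck=0
    enabled=1

    # then
    sudo yum install -y mongodb-org
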
[14:19:49] <cqdev> Any idea why sorting isn't working in this aggregation? http://pastebin.com/ekU1czun
[14:26:25] <ule> hi there!
[14:26:29] <cqdev> Any idea why sorting isn't working in this aggregation? http://pastebin.com/ekU1czun
[14:26:32] <cqdev> hello
[14:26:33] <ule> I've just installed mongod here on MacOS
[14:26:36] <ule> following this: http://docs.mongodb.org/manual/tutorial/install-mongodb-on-os-x/
[14:26:47] <ule> but.. now.. how can I set user/password to connect it?
[14:27:16] <ule> I've installed Robomongo Gui Client to access it but I'm receiving authentication failed
[14:27:23] <ule> the documentation could improve this step
[14:27:41] <ule> maybe after the installation instructions.. how to make a quick connect and test it
[14:27:52] <ule> to assure the installation was successful
[14:27:58] <ule> anyone can help ?
[14:28:09] <ule> is there any default password?
[14:28:22] <cqdev> I just started using the client myself, so I'm not sure. there isn't a default user/pass for mongo
[14:29:16] <GothAlice> ule: There is no default authentication.
[14:29:44] <ule> GothAlice: so.. how can I connect using the CLI though?
[14:29:50] <GothAlice> ule: Just try connecting Robomongo to localhost, no auth, or just run "mongo" on the command line and it should connect. By default it is secured by not allowing external access.
[14:30:09] <cqdev> GothAlice: is it possible to use the aggregate $sort after $geoNear? It doesn't seem to be working for me.
[14:30:17] <GothAlice> ule: I actually recommend installing MongoDB via Homebrew: then it gets set up for automatic startup, if you choose.
[14:30:23] <ule> weird.. robomongo doesn't show anything when I click connect.. maybe because I don't have any database?
[14:30:28] <ule> ohhh I see
[14:30:32] <GothAlice> ule: Or it's not running.
[14:30:39] <ule> yeah it's running
[14:30:53] <ule> 2015-05-15T10:18:01.186-0400 I NETWORK [initandlisten] connection accepted from 127.0.0.1:65263 #10 (5 connections now open)
[14:31:12] <ule> GothAlice: I'll test the console one
[14:31:16] <ule> GothAlice: thanks!
[14:31:42] <ule> GothAlice: I have 15 million rows in my mysql.. I want to test the performance using mongodb
[14:31:42] <greyTEO> https://github.com/10gen-labs/mongo-connector/wiki/Installation#installing-as-a-linux-service
[14:32:04] <GothAlice> ule: The mongod server is likely still not running. "brew info mongodb" should output the commands needed to start and/or register it for startup with the machine.
[14:32:07] <ule> GothAlice: my problem is when I try searches using LIKE %% on mysql.. I want to compare the performance using the same criteria on mongodb
[14:32:19] <greyTEO> Is that step possible when installing the package via pip?
[14:32:20] <ule> gotcha
[14:32:21] <cqdev> GothAlice: Nevermind I think I see why, $geoNear is limiting the result set so the sorting isn't sorting on the full set
[14:32:25] <ule> GothAlice: thanks !
[14:32:30] <GothAlice> ule: MongoDB isn't a SQL or relational database. Most comparisons will be invalid.
[14:32:42] <GothAlice> (If you treat MongoDB like an SQL database you're in for a bad time.)
[14:33:22] <ule> I just want to generate reports using select * from table where email like '%name%'
[14:33:34] <GothAlice> Well, that happens to be a very unfortunate query in MongoDB.
[14:33:51] <ule> because.. if I had to use exact keys to search things.. I'll keep using mysql + indexes
[14:33:52] <GothAlice> It requires regular expressions, and because it's a mid-text search (and not a prefix search) it can't use indexes.
[14:34:12] <GothAlice> But I've never, in my life, seen a mid-text search in e-mail.
[14:34:16] <GothAlice> (Not a real-world query.)
[14:34:21] <ule> it's just an example
[14:34:23] <ule> :P
[14:34:27] <ule> but.. thanks anyways
[14:34:34] <ule> really appreciate your support
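
To make the point above concrete: a mid-string regex scans the collection, while a case-sensitive, prefix-anchored regex can use an ordinary index. A sketch with illustrative names:

    db.users.createIndex({ email: 1 });

    db.users.find({ email: /name/ });     // equivalent of LIKE '%name%': cannot use the index
    db.users.find({ email: /^alice/ });   // prefix match: can use the index on email
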
[14:35:22] <GothAlice> ule: https://blog.serverdensity.com/does-everyone-hate-mongodb/ < If you're planning on writing a blog post to go with your research, prepare yourself. ;)
[14:36:58] <ule> GothAlice: lol
[14:37:00] <cqdev> if you are trying to first pull back results based on location (ordered by distance) and then sort that result set by another arbitrary field, what would be the "Mongo" way to accomplish this?
[14:37:12] <ule> don't worry.. I just want to decide when should I migrate to nosql..
[14:37:19] <ule> my database is getting bigger everyday
[14:37:35] <ule> I'll gonna reach 1 billion rows in 6 month
[14:37:47] <ule> *I'm
[14:38:28] <greyTEO> ule, I dont think db size should be the determining factor to move to nosql
[14:39:07] <GothAlice> cqdev: I'll just state that I have 28 TiB of data in MongoDB totalling billions of "documents" (mostly for the file chunks) — I'm using MongoDB because it's schemaless and I can assign arbitrary metadata to go with each file. It works perfectly for this use case.
[14:39:18] <greyTEO> if you are worried about search performance, you can pipe your data into something like ElasticSearch for better search performance and keep mysql as the datastore
[14:39:40] <ule> greyTEO: yeah. that's my idea
[14:39:41] <GothAlice> Or, learn how to denormalize data to optimize queries.
[14:40:05] <ule> greyTEO: I just want to get a better feel for the idea of using json.. learn these terms.. etc
[14:40:07] <GothAlice> (You can't think of MongoDB in the spreadsheet shapes SQL forces your brain into.)
[14:40:24] <greyTEO> in the end, mongo wont solve your problems just by using it. You will be highly disappointed in it
[14:41:06] <GothAlice> (And have a high likelihood of writing a blog post that won't be factually correct. ;)
[14:41:18] <cqdev> GothAlice: I understand the need to denormalize schema-less data, I'm just trying to figure out how to pull back data based on location and then sort on the distance and another field efficiently. This was my query, http://stackoverflow.com/questions/30261761/mongodb-sorting-on-geonear-results.
[14:41:31] <GothAlice> cqdev: Er, sorry, meant that for ule.
[14:41:40] <greyTEO> ule, for searching I recommend ElasticSearch or Solr (not very familiar with it)
[14:41:41] <cqdev> GothAlice: I thought as much :)
[14:42:19] <ule> greyTEO: I tested elasticsearch some days ago
[14:42:32] <greyTEO> then keep mysql as your data store. Continuing to add indexes for search in MySQL is going to make your db size grow insanely fast.
[14:42:34] <ule> greyTEO: I like the idea of using the embedded restful server
[14:42:50] <StephenLynx> I have yet to figure out what elastic is for :v
[14:42:52] <GothAlice> cqdev: $geoNear sorts your results. Period. If you want a similar range query, but sorted on other criteria, you have to use $near: http://docs.mongodb.org/manual/reference/operator/query/near/#op._S_near
[14:42:54] <ule> greyTEO: I set up redis + rabbitmq to help me
[14:43:33] <ule> greyTEO: Question.. do I need to make a copy of my whole mysql database into elasticsearch to make it work properly?
[14:43:42] <cqdev> GothAlice: can I use near with aggregation?
[14:45:24] <greyTEO> no you don't HAVE to. you only put in the data that you need to search by. And add an id field that maps back to MySQL rows. when ES returns its results, you will have the MySQL ids there. Should be a fast lookup then
[14:45:46] <greyTEO> you can even make the ES doc ids the same as the mysql row ids
[14:45:59] <greyTEO> stealing a page out of the mongo-connector book
[14:46:03] <cqdev> GothAlice: I think this is what I was looking for, it seems to work db.Vehicle.find({isActive: true, condition: {$in: ['New', 'Used']}, "loc" : {"$within" : {"$center" : [[26.243640,-80.265397], 50]}}}).sort({condition: -1}).limit(10)
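
The $near form GothAlice mentions, sketched against cqdev's collection. The field names and values come from the messages above, but the schema itself is guesswork (it assumes loc holds GeoJSON points with a 2dsphere index):

    db.Vehicle.createIndex({ loc: "2dsphere" });

    db.Vehicle.find({
        isActive: true,
        condition: { $in: ["New", "Used"] },
        loc: { $near: {
            $geometry: { type: "Point", coordinates: [-80.265397, 26.243640] },  // [lng, lat]
            $maxDistance: 80000                                                   // metres (~50 miles)
        } }
    }).limit(10);
    // $near returns documents nearest-first; that built-in ordering is the whole point,
    // and also the limitation discussed later when another sort key is needed.
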
[14:46:55] <GothAlice> ule: One of my favourite aspects of MongoDB is that it replaced rabbitmq, zeromq, redis, and memcache for me. Reducing complexity and infrastructure costs and maintenance requirements.
[14:47:49] <ule> GothAlice: are you saying me.. that I can get rid of my rabbitmq and redis and use only mongodb?
[14:48:10] <GothAlice> Capped collections provide queue functionality.
[14:48:13] <ule> but.. redis works different.. it storages things on memory right?
[14:48:33] <GothAlice> A capped collection that can fit in RAM will be in RAM.
[14:48:37] <GothAlice> (MongoDB uses memory mapped files.)
[14:49:33] <cheeser> ish
[14:49:34] <cheeser> :D
[14:49:47] <GothAlice> ule: Here are some slides from a presentation I gave on using MongoDB as a distributed task system, vs. the Redis-based Celery.
[14:49:48] <GothAlice> https://gist.github.com/amcgregor/4207375
[14:50:00] <GothAlice> It's… quite a viable approach.
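
A minimal sketch of the capped-collection-as-queue idea (names and sizes are arbitrary):

    // fixed-size collection that preserves insertion order and overwrites the oldest entries
    db.createCollection("work_queue", { capped: true, size: 16 * 1024 * 1024 });

    db.work_queue.insert({ task: "send_email", payload: { to: "user@example.com" } });

    // consumers read in insertion order; most drivers can open a tailable cursor
    // on a capped collection to block waiting for new documents.
    db.work_queue.find().sort({ $natural: 1 });
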
[14:50:18] <StephenLynx> yeah, mongo is pretty much an actual implementation of all the hacks surrounding mysql integrated.
[14:50:44] <GothAlice> That makes no sense, StephenLynx.
[14:51:06] <StephenLynx> well, you had all those things surrounding mysql because of what it couldnt do.
[14:51:27] <StephenLynx> then mongo looked at these problems and actually solved them instead of adding bloat around mysql.
[14:51:38] <ule> GothAlice: really cool
[14:51:45] <ule> I like this channel guys
[14:51:52] <ule> you guys are very active..
[14:52:04] <ule> this is fundamental to maintain a community
[14:52:37] <ule> maybe that's one of the reasons why mongodb got all this attention from developers
[14:53:21] <StephenLynx> yeah, it fits a certain number of scenarios that wasn't quite explored when mongo came to be.
[14:53:30] <StephenLynx> weren't*
[15:17:18] <okanck> query is so slow even on indexed fields... http://pastebin.com/5f3EeCCh
[15:17:44] <okanck> is it normal 89sec for response time?
[15:18:50] <cqdev> is there a function in mongo for calculating the distance between two sets of coordinates? I need to be able to sort by asc on distance, then sort by another group of fields and be able to show the distance.
[15:19:37] <cqdev> The problem with using geoNear is you can't effectively sort that data by anything other than distance.
[15:27:09] <cqdev> is there a way to output the result of a function call as a projection?
[15:31:30] <StephenLynx> yes.
[15:31:42] <StephenLynx> you just return a json object with the correct fields.
[15:31:56] <StephenLynx> I do that.
[15:31:58] <cqdev> StephenLynx: do you need to use forEach()?
[15:32:02] <StephenLynx> no.
[15:32:34] <cqdev> StephenLynx: do you have an example? I'm trying to append to my result set the results of a custom function call
[15:32:54] <StephenLynx> se
[15:32:55] <StephenLynx> sec
[15:33:41] <StephenLynx> https://gitlab.com/hub1dev/hub1/blob/master/src/be/public/createComment.js#L233
[15:36:25] <cqdev> StephenLynx: I'm not sure I follow, I'm using this query: http://pastebin.com/bYgQu23V
[15:49:14] <HikaruBG> hi guys
[15:49:34] <HikaruBG> I have a case where I have to reference two Mongo DB Documents
[15:50:05] <HikaruBG> I have a User model, and I have also Client and Worker modules
[15:50:38] <HikaruBG> I would like both Client and Worker to have the User embedded in them (so they can use Login and Authorisation features)
[15:50:44] <HikaruBG> Authorization)
[15:50:53] <HikaruBG> ANy ideas how this can happen?
[15:51:06] <HikaruBG> (I am new at Mongo DB, by the way!)
[15:52:46] <StephenLynx> mongo does not ensure relational integrity, nor does it support joins, so any relation between collections has to be handled in application code, outside mongo.
[16:10:34] <pEYEd> how does the default memory signature of MongoDB compare to MySQL? I am in a low memory environment and MySQL is a monster.
[16:11:08] <StephenLynx> how much memory?
[16:11:18] <Zelest> No idea, but as a rule of thumb, database-servers like memory :)
[16:11:30] <Zelest> likes*
[16:12:53] <StephenLynx> I used to run a server with 512 ram with no issues, but it didn't have to sustain much load either
[16:19:42] <pEYEd> StephenLynx: I'm pushing 768 and MySQL is 90% of it with zero load.
[16:19:54] <StephenLynx> lol
[16:20:09] <StephenLynx> yeah, mongo will be fine with that much ram
[16:21:24] <pEYEd> its bad. and I have the same thing on 3 different servers. new installs. I tried tuning the memory down, any adjustment that would actually bring it down, crashed it out. I give up on mysql.
[16:24:37] <pEYEd> StephenLynx: Someone told me that Mongo was tuned for accepting serialized JSON objects in pretty much 'ready to index' format. Does this apply? I have about 20 massive instagram objects I need inserted into a blank DB (skating the db build structure of each) Does Mongo play friendly with JSON?
[16:25:19] <StephenLynx> mongo practically stores json.
[16:25:27] <StephenLynx> any valid json object is valid in mongo.
[16:25:34] <pEYEd> o
[16:25:36] <pEYEd> :)
[16:25:43] <StephenLynx> objects, array, strings, booleans, numbers
[16:25:54] <GothAlice> Well.
[16:26:07] <StephenLynx> plus some other things that are not on json specification, like dates and geolocation tools.
[16:26:08] <GothAlice> MongoDB uses BSON, a superset of JSON in binary form.
[16:26:12] <StephenLynx> that
[16:26:15] <GothAlice> http://bsonspec.org/spec.html
[16:27:13] <GothAlice> Pro tip: actually make use of the extended types, like sane ISODates, real integers, etc. ;)
[16:27:57] <pEYEd> GothAlice: no kidding :)
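
An illustrative shell insert using the richer BSON types being recommended, rather than stuffing everything into strings (the collection and field names here are made up):

    db.posts.insert({
        title:   "hello",
        created: new Date(),       // stored as a BSON date (shown as ISODate in the shell)
        views:   NumberInt(0),     // real 32-bit integer instead of a double
        tags:    ["mongodb", "bson"],
        loc:     { type: "Point", coordinates: [-73.97, 40.77] }   // GeoJSON point
    });
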
[16:29:07] <GothAlice> StephenLynx: You ran off after describing how you store dates yesterday, so I couldn't rib you for wasting 15 bytes on every date and losing the ability to use date projections. :^P
[16:31:47] <GothAlice> General question, though. BSON binary (type \x05) data with a user-defined subtype (\x80) — am I to safely assume that _all_ subtypes above \x80 are user-defined?
[16:31:57] <StephenLynx> yeah, I assume it isn't the optimal way for storing them.
[16:32:20] <StephenLynx> and what are date projections?
[16:33:20] <GothAlice> StephenLynx: http://docs.mongodb.org/manual/reference/operator/aggregation-date/ < these.
[16:33:38] <GothAlice> (Oh, also millisecond accuracy, timezone-awareness, etc., etc.)
[16:34:19] <StephenLynx> hm
[16:34:25] <StephenLynx> that is interesting.
[16:34:29] <GothAlice> They're missing weekOfYear, though. That's sad making.
[16:34:42] <StephenLynx> I will probably change when I have to deal with dates again
[16:34:43] <GothAlice> Oh, wait, no, they just simplified the name.
[16:34:44] <GothAlice> Phew.
[16:35:03] <StephenLynx> can I select an arbitrary format though?
[16:35:17] <GothAlice> It's an ISODate. You can get it back in whatever strftime format you want.
[16:35:22] <GothAlice> http://docs.mongodb.org/manual/reference/operator/aggregation/dateToString/#exp._S_dateToString
[16:35:41] <StephenLynx> noice
[16:35:51] <GothAlice> (Either server-side, as above, or client-side as I do. Again, it's faster to transfer a 4-byte date than the formatted string!)
[16:35:57] <StephenLynx> ah, it was introduced in 3.0.
[16:36:13] <GothAlice> That was, yes. I don't recommend using it for the reason given. ;)
[16:36:14] <StephenLynx> makes sense that I haven't heard of it until now. I didn't take the time to study 3.0 novelties.
[16:36:32] <GothAlice> The rest of those date projections have existed for a long time, though.
[16:36:54] <GothAlice> So old those other operator doc pages don't mention a version.
[16:37:02] <StephenLynx> what will my server output when I use JSON.stringify on a date?
[16:37:20] <GothAlice> That's simple enough to be worth testing out.
[16:37:32] <StephenLynx> will do.
[16:37:42] <StephenLynx> because what my clients get is what concerns me.
[16:38:09] <GothAlice> They get whatever you give them.
[16:38:22] <GothAlice> If you're JSON.stringifying it, you can do whatever you want to it.
[16:39:03] <GothAlice> StephenLynx: http://showterm.io/306a71722ffc95fe3b9c8 < often it's faster to try than to even ask the question. ;)
[16:39:40] <GothAlice> (That string is referred to as an ISO-formatted date. It's a standard, and handing that to a JS Date object constructor will Just Work™.)
[16:39:49] <StephenLynx> ah, yeah, I am not used to run js on mongo.
[16:40:36] <StephenLynx> does it always use 3 digits for the milliseconds?
[16:40:48] <GothAlice> Yes.
[16:41:05] <StephenLynx> then its all good.
[16:41:06] <GothAlice> (Again, that's a standardized ISO format.)
[16:41:07] <StephenLynx> will start using it.
[16:41:31] <GothAlice> (The T separating the date and time, and Z, for zulu or UTC, on the end are strong indicators of ISO date-ness.)
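
The date projections and $dateToString being discussed, sketched against a hypothetical "events" collection with a BSON date field named created ($dateToString requires 3.0; the other operators are much older):

    db.events.aggregate([
        { $project: {
            year:  { $year: "$created" },
            week:  { $week: "$created" },
            day:   { $dayOfMonth: "$created" },
            label: { $dateToString: { format: "%Y-%m-%d %H:%M", date: "$created" } }
        } }
    ]);
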
[16:42:13] <lfamorim> Someone how to save the output of db.collection.group in another collection?
[16:42:29] <lfamorim> s/Someone how/Someone know how/g
[16:42:33] <GothAlice> lfamorim: Use db.collection.aggregate with an $out stage.
[16:42:46] <lfamorim> Goopyo, aggregate is the only choice?
[16:43:07] <Goopyo> GothAlice: he meant you
[16:43:08] <GothAlice> Or map/reduce, but .group() is basically just the $group stage of a normal aggregate.
[16:43:11] <lfamorim> because $group is very limited
[16:43:13] <GothAlice> Goopyo: Yeah, I got that. ;)
[16:43:23] <GothAlice> lfamorim: How so?
[16:43:23] <Goopyo> :)
[16:43:35] <lfamorim> GothAlice, .group has callback functions
[16:43:50] <StephenLynx> hm
[16:43:52] <StephenLynx> so as everything.
[16:43:54] <GothAlice> lfamorim: Not really. That's an illusion caused by .group being implemented using map/reduce.
[16:44:33] <lfamorim> hum ....
[16:44:34] <GothAlice> To continue with the "run JS on the server" approach, see: http://docs.mongodb.org/manual/reference/command/mapReduce/ and the "out" option.
[16:45:13] <GothAlice> http://docs.mongodb.org/manual/core/map-reduce/ — heh, sorry, should have linked this, not the command reference.
[16:46:15] <lfamorim> GothAlice, http://pastebin.com/W8GLKV8f
[16:46:58] <GothAlice> So not using collection.group().
[16:46:59] <lfamorim> I don't know why it takes 500 hours to perform the map reduce running in ten threads; there are 500 million records.
[16:47:29] <lfamorim> GothAlice, I think the group as a MapReduce alternative.
[16:47:35] <lfamorim> Thinked*
[16:47:44] <lfamorim> thought
[16:47:45] <lfamorim> rs...
[16:47:46] <GothAlice> Instead think "aggregate as a map/reduce alternative".
[16:48:05] <lfamorim> Can I get the entire record in aggregate?
[16:48:12] <lfamorim> An array of results in aggregate
[16:48:14] <GothAlice> .group() is just a shortcut for map/reduce.
[16:48:31] <GothAlice> Yes, by default aggregate queries return one "record" containing an array of the results.
[16:48:57] <GothAlice> (This means the entire result set needs to fit in one 16MB document, or you need to use $out to write it to a real collection.) You can also stream results.
[16:50:00] <GothAlice> Ref: http://docs.mongodb.org/manual/reference/command/aggregate/#dbcmd.aggregate and the allowDiskUse and cursor options.
[16:52:45] <GothAlice> http://docs.mongodb.org/manual/reference/method/db.collection.aggregate/#db.collection.aggregate provides some better descriptions and additional links to resources for those options (including tutorials).
[17:01:16] <lfamorim> GothAlice, thank you!
[17:01:18] <lfamorim> =D
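
A sketch of the aggregate + $out suggestion; the collection and field names are placeholders:

    db.events.aggregate([
        { $group: { _id: "$userId", total: { $sum: 1 } } },
        { $out: "events_by_user" }          // $out must be the final stage
    ], { allowDiskUse: true });             // needed if intermediate results exceed the memory limit
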
[17:03:18] <cqdev> i think i've hit a wall with mongo, is there a way to sort on both distance and a field? It seems this is a limitation of the platform as if you use geoNear you get the closest 100 results, but those results aren't guaranteed to have the fields that you would need to maybe sort on by another field
[18:55:48] <cqdev> I was wondering if anyone knows of a solution to this problem: http://qnalist.com/questions/5227794/mongo-2-4-4-geonear-sorting-on-distance-first-and-then-by-name
[18:56:41] <cqdev> Essentially it seems since $near is not supported with sharding enabled, mongo does not have the ability to sort on distance and another field at the same time, is this correct?
[18:56:53] <GothAlice> 2.4. Any chance you can upgrade?
[18:57:00] <GothAlice> (Unrelated to the geo issue.)
[18:57:48] <cheeser> you could probably use an aggregation for that.
[18:58:24] <cqdev> cheeser: i've tried with GeoNear, but it first sorts by distance which could remove documents that would show up first by another sort
[18:59:22] <cheeser> so sort by those others first, then by geonear, then limit
[18:59:29] <cqdev> cheeser: http://pastebin.com/DUM31AxN is essentially what i'm trying to do, but adding the high limit slows down the query drastically
[18:59:46] <cqdev> cheeser: i thought that in an aggregate function the geonear has to be first?
[19:00:46] <cheeser> i don't know. i've never worked with the geo stuff so i'm not sure what restrictions it has
[19:01:56] <cqdev> GothAlice: that's basically the issue i'm running into, and using geoWithin doesn't sort anything and would force me to process and sort a massive collection of 300k+ records
[19:02:01] <GothAlice> The last geo assistance I provided revolved around switching from $geoNear to the "within a circular radius" form.
[19:02:26] <GothAlice> (But sorting by distance wasn't part of the requirement, on that one.)
[19:02:56] <cqdev> GothAlice: I tried that too, but I'm not sure how to effectively sort by distance. Here is the query I tried with that strategy: http://pastebin.com/RdRNjTUs
[19:04:41] <GothAlice> "As the crow flies" distance calculations on a spherical surface aren't actually overly hard. (The "circular radius" query documentation includes the formula to convert Earth radians to miles, for one example.) You'd need to find the distance between the two sets of coords (can be done in a $project aggregate stage), then $sort on that.
[19:05:15] <GothAlice> It'd be an estimate, of course. And progressively less accurate the longer the distance.
[19:05:23] <GothAlice> (But should work for sorting!)
[19:05:30] <cqdev> GothAlice: That's about where I'm at now, I created a function that provides the distance in miles using the Haversine formula, but I can't figure out how to apply that custom function to a sort
[19:05:40] <GothAlice> You don't.
[19:05:57] <GothAlice> You implement the forumla as a $project expression, to save the resulting value into the document, then $sort on that field.
[19:06:39] <cqdev> GothAlice: do you have an example of how that would work? (thank you immeensely for your help, I've been pulling my hair out!)
[19:07:48] <cqdev> GothAlice: it looks like you use $let?
[19:08:00] <GothAlice> To pass in variables, you can use $let.
[19:08:18] <cqdev> GothAlice: is it possible to call a function from within let? like db.eval()
[19:08:21] <GothAlice> No.
[19:08:25] <GothAlice> Not at all.
[19:08:28] <GothAlice> Not even close, even.
[19:08:43] <GothAlice> You need to implement the formula within the MongoDB expression language.
[19:09:09] <GothAlice> You can't use an external library to feed server-side calculations in the way you are thinking.
[19:09:39] <cqdev> GothAlice: it's a function inside of mongo though, I normally call it like db.eval('distance(lat1, lon1, lat2, lon2)')
[19:09:59] <GothAlice> :|
[19:10:55] <GothAlice> However, no, aggregate queries explicitly avoid spinning up the JS engine. That makes them faster, and more secure.
[19:11:05] <cqdev> GothAlice: I was running out of ideas lol, we are using php with mongo and there is no way I can use php to do the distance calculations from the mongo return, it would be way too memory intensive
[19:11:32] <cqdev> GothAlice: so, basically I need to combine my formula into one line in the aggregate function and sort on the new value from $let?
[19:11:40] <GothAlice> cqdev: http://www.geodatasource.com/developers/php
[19:12:07] <GothAlice> Or just implement the PHP code I gave you within the scope of a MongoDB expression. ;)
[19:12:21] <cqdev> GothAlice: the problem is that I have to first return 500k+ records then sort it in php which would be very slow
[19:12:27] <GothAlice> Stop.
[19:12:30] <GothAlice> Not what I'm suggestion.
[19:12:33] <GothAlice> Suggesting, even.
[19:13:00] <GothAlice> http://docs.mongodb.org/manual/meta/aggregation-quick-reference/#aggregation-expressions
[19:14:50] <GothAlice> Notably, http://docs.mongodb.org/manual/meta/aggregation-quick-reference/#arithmetic-expressions remembering that sin/cos/tan are implementable using division and multiplication, and a little subtraction.
[19:15:29] <cqdev> GothAlice: Oh okay, your point (I believe) is to use the arithmetic expressions to create a new field in the $project stage, then sort on the new field after the project?
[19:15:30] <GothAlice> I'm suggesting you teach MongoDB how to calculate the value against the records for you.
[19:15:42] <GothAlice> Then it can sort on it, server-side.
[19:15:48] <GothAlice> Correct.
[19:16:24] <cqdev> GothAlice: ahh groovy, thank you so very much. We tried ElasticSearch, and OrientDB and both of them can't compare to the speed and support we've been getting with Mongo
[19:16:28] <GothAlice> … it won't be pretty, but it'd be about as optimum as you could get.
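
One way to flesh out that suggestion on 3.0, which has no trig or $sqrt operators in the aggregation language: pre-compute cos(reference latitude) client-side and project a squared equirectangular distance, which is monotonic with true distance over short ranges, so it sorts correctly without a square root. Everything here (the collection name, a loc subdocument with lng/lat fields, the price/condition fields, the 50-mile radius) is an assumption for illustration:

    var refLng = -80.265397, refLat = 26.243640;
    var cosLat = Math.cos(refLat * Math.PI / 180);    // computed in the shell/driver, not in mongo

    db.Vehicle.aggregate([
        // radius filter first; 50 miles / 3963.2 miles converts to radians
        { $match: { isActive: true,
                    loc: { $geoWithin: { $centerSphere: [[refLng, refLat], 50 / 3963.2] } } } },
        // squared planar distance: (dLat)^2 + (cosLat * dLng)^2
        { $project: {
            price: 1, condition: 1, loc: 1,
            distSq: { $add: [
                { $multiply: [{ $subtract: ["$loc.lat", refLat] },
                              { $subtract: ["$loc.lat", refLat] }] },
                { $multiply: [{ $multiply: [{ $subtract: ["$loc.lng", refLng] }, cosLat] },
                              { $multiply: [{ $subtract: ["$loc.lng", refLng] }, cosLat] }] }
            ] }
        } },
        { $sort: { distSq: 1, price: 1 } },
        { $limit: 10 }
    ]);
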
[19:44:07] <greyTEO> I think ES would be a good use case to geo queries with text
[19:45:33] <greyTEO> lol I'm a big fan of ES so I have a biased opinion though
[19:46:09] <cqdev> I've tried using ES, I like it but I'm not crazy about the HTTP only interface
[19:46:45] <cqdev> I don't have a lot of experience with it though so there may be a binary interface at this point
[19:47:24] <greyTEO> they have another interface but it's really only for the java client
[19:47:51] <cqdev> it seems that geo sorting with multiple columns is an issue for just about every db I've been testing over the past week
[19:48:20] <greyTEO> https://www.elastic.co/guide/en/elasticsearch/reference/1.x/search-aggregations.html
[19:48:40] <GothAlice> cqdev: Yeah, it's usually geo or nothing. :(
[19:48:42] <greyTEO> which is what I think GothAlice was referring to with mongo
[19:49:19] <greyTEO> you have to search by geo, then reduce down with filters then sort...
[19:49:21] <cqdev> I wish $near would provide the distance, if it did I would be set since we don't really need sharding
[19:49:28] <greyTEO> kind of a complex task to do on the fly
[19:49:36] <greyTEO> all the time
[19:50:15] <cqdev> greyTEO: the problem is when you use geoNear though you are already filtering the data set, so really you need to $project first your data set then apply a $let to get the distance, then sort on the distance + other fields
[19:51:34] <greyTEO> why would you need to sort first if the documents are out of bounds (distance)?
[19:52:01] <greyTEO> they are either in or out, then you use sort to get the docs you want in bounds. Sorting wont change their distance.
[19:52:53] <greyTEO> im not familiar with $let and $project so i might be missing something
[19:53:00] <cqdev> greyTEO: lets say we're sorting on a collection of vehicles, specifically the distance from a location and the price of the car. If you search by distance first then you will end up with a limited set of data (like 100 lets say) and then when you sort on the price it's not really the lowest price of the data set, it's just the lowest price of the closest 100.
[19:57:07] <greyTEO> Do you not want to filter out vehicles that do not match your criteria?
[19:57:43] <greyTEO> seems like you might be able to do this in 2 queries then merge the results.
[20:31:46] <MacWinner> GothAlice, how close are you to being comfortable with wiredtiger? noticed another mongo update yesterday
[20:32:15] <GothAlice> None of the major outstanding tickets for me have been closed yet.
[20:32:42] <GothAlice> (I.e. still outstanding memory, performance, and data loss issues.)
[20:34:01] <StephenLynx> 3.0.3, mac?
[20:34:38] <GothAlice> 3.0.x so far, Mac (kernel panicking Mach is a neat trick) and VMs on Rackspace managed by MMS.
[20:39:38] <StephenLynx> nah, I was asking macwinner which version the update was
[20:40:08] <MacWinner> StephenLynx, yeah.. 3.0.3
[20:40:27] <MacWinner> GothAlice, ahh.. ok.. thanks.. just checking to make sure. lunch time!
[20:40:39] <GothAlice> Omnom.
[20:52:34] <Spec> :0
[20:55:16] <cqdev> GothAlice: So I tried the $let method but I can't see how to do the calculations with the expressions, you need recursion to calculate sin by hand
[22:46:53] <krazyj> does anyone know how i could instruct mongo to return a simple array of values, for a given find()?
[22:47:17] <krazyj> i’m selecting one field, and i just want an array of values… not a dict of keys w/ values
[22:55:51] <StephenLynx> if you just want to store everything in a dumb array, why are you saving data as objects in the first place?
[23:11:16] <krazyj> StephenLynx: hm? this is about results formatting...
[23:11:26] <krazyj> i want to grab one value from every result, and spit it out as an array of values
[23:11:30] <krazyj> not a dict of keys w/ values
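
Two shell-side ways to get a flat array of a single field (the names here are placeholders): distinct() for unique values, or map() over the cursor to keep duplicates and order:

    db.users.distinct("email");

    db.users.find({}, { email: 1, _id: 0 })
            .map(function (doc) { return doc.email; });
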
[23:40:17] <st34lth> hello this is really odd, i'm able to auth using mongo -u -p etc, but using clients like mongohub, etc, it tells me auth failed
[23:40:22] <st34lth> what am I missing
[23:55:08] <fructose_> Is there a way to do an update so that when the _id is null (or some other non-existent id), and upsert is true, it'll automatically create an ObjectID on insert? I've tried using $setOnInsert, but that gave me "Mod on _id not allowed"
[23:56:38] <StephenLynx> afaik auto-creation of _id is default.
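
As noted, the server only generates an ObjectId when the upsert filter doesn't pin _id to a value (including null). One common pattern, sketched here with a hypothetical existingId variable, is to generate the id up front when it's missing so the filter never carries null:

    var id = existingId || new ObjectId();   // ObjectId() is the shell/driver helper
    db.things.update(
        { _id: id },
        { $set: { name: "example" } },
        { upsert: true }
    );
    // if the filter omits _id entirely, the inserted document gets an
    // automatically generated ObjectId.
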